THE PYRIN GENE AND MUTANTS THEREOF, WHICH CAUSE FAMILIAL MEDITERRANEAN FEVER

Background of the Invention 

Field of the Invention

This invention relates to a novel genomic DNA sequence (MEFV) encoding a 
protein (pyrin) associated with familial Mediterranean fever (FMF). More 
specifically, the invention relates to the isolation and characterization of MEFV, and 
the correlation of mutations in MEFV with FMF disease. 

Background of the Invention 

Familial Mediterranean Fever (FMF) is a recessively inherited disorder 
characterized by dramatic episodes of fever, serosal inflammation and abdominal 
pain. This inflammatory disorder is episodic, with self-limited bouts of fever 
accompanied by unexplained arthritis, sterile peritonitis, pleurisy and/or skin rash.
Patients often develop progressive systemic amyloidosis from the deposition of the 
acute phase reactant serum amyloid A (SAA). In some patients, progressive systemic 
amyloidosis can lead to kidney failure and death. The factors which incite an 
episode are unclear. 

FMF is observed primarily in individuals of non-Ashkenazi Jewish,
Armenian, Arab and Turkish background. Although rare in the United States,
the incidence of FMF in Middle Eastern populations can be as high as 1:7 in Armenian
populations and 1:5 in non-Ashkenazi Jewish populations.

FMF attacks are characterized by a massive influx of polymorphonuclear
leukocytes (PMNs) into the affected anatomic compartment. At the biochemical
level, patients have been reported to have abnormal levels of C5a inhibitor (Matzner
and Brzezinski, "C5a-inhibitor deficiency in peritoneal fluids from patients with
familial Mediterranean fever," N. Engl. J. Med., 311:287-290 (1984)),
neutrophil-stimulatory dihydroxy fatty acids (Aisen et al., "Circulating hydroxy fatty
acids in familial Mediterranean fever," Proc. Natl. Acad. Sci. USA, 82:1232-1236
(1985)), and dopamine β-hydroxylase (Barakat et al., "Plasma dopamine
beta-hydroxylase: rapid diagnostic test for recurrent hereditary polyserositis,"
Lancet, 2:1280-1283 (1988)). Although linkage studies have placed the gene causing
FMF (designated MEFV) on chromosome 16p (Pras et al., "Mapping of a gene causing
familial Mediterranean fever to the short arm of chromosome 16," N. Engl. J. Med.,
326:1509-1513 (1992); Shohat et al., "The gene for familial Mediterranean fever in
both Armenians and non-Ashkenazi Jews is linked to the α-globin complex on 16p:
evidence for locus homogeneity," Am. J. Hum. Genet., 51:1349-1354 (1992); Pras et
al., "The gene causing familial Mediterranean fever maps to the short arm of
chromosome 16 in Druze and Moslem Arab families," Hum. Genet., 94:576-577
(1994); French FMF Consortium, "Localization of the familial Mediterranean
fever gene (FMF) to a 250 kb-interval in non-Ashkenazi Jewish founder
haplotypes," Am. J. Hum. Genet., 59:603-612 (1996)), the genetic basis of FMF has
not previously been identified.

Current treatment regimens for FMF include daily oral administration of
colchicine. Although colchicine has been shown to cause near complete remission
in about 75% of FMF patients and prevent amyloidosis, colchicine is not effective in
all patients. Therefore, there is a need for new treatments for colchicine-resistant
patients.

Additionally, there is a need for an accurate diagnostic test for FMF. 
Patients having FMF in countries where the disease is less prevalent often 
experience years of attacks and several exploratory surgeries before the correct 
diagnosis is made.



Summary of the Invention 

The invention provides a novel genomic nucleic acid sequence (MEFV)
[SEQ ID NO: 1], shown in Figure 1, encoding the protein pyrin, which is associated
with familial Mediterranean fever (FMF). The corresponding cDNA sequence (v75-1)
[SEQ ID NO: 2] and encoded amino acid sequence [SEQ ID NO: 3] are shown
in Figure 2. The invention is also directed towards fragments of the DNA sequence
that are useful, for example, as hybridization probes for diagnostic assays or
oligonucleotides for PCR priming. Additionally, the invention is directed towards
the corresponding sequence for the RNA transcript and fragments thereof.

Another aspect of the invention provides the amino acid sequence for a
protein associated with FMF. This protein is called pyrin, to connote its relationship
to fever. The invention is directed towards the full-length amino acid sequence,
fusion proteins containing the amino acid sequence, and fragments thereof. These
proteins are useful, for example, as antigens to produce specific anti-pyrin
antibodies to be used as agents in diagnostic assays. Alternatively, the protein may
be used in therapeutic compositions.

Mutations in pyrin result in FMF. Therefore, the invention is also directed
towards mutants of the nucleic acid and amino acid sequences associated with FMF.
In particular, the invention discloses three missense mutations, clustered within
about 40 to 50 amino acids, in the highly conserved rfp (B30.2) domain [SEQ ID
NO: 5] at the C-terminal end of the protein. These mutants include M680I, M694V,
K695R and V726A, each of which is associated with FMF.

Additionally, the invention includes methods for diagnosing a patient at risk
for having FMF using the nucleic acid and/or amino acid sequences of the invention.
Such methods include, for example, hybridization techniques using nucleic acid
sequences, PCR amplification of MEFV, and immunoassays using anti-pyrin
antibodies to identify mutations in MEFV or pyrin which are indicative of FMF.

Brief Description of the Figures 

Figure 1 shows the genomic nucleic acid sequence for the gene associated with FMF;
Figure 2 shows a cDNA sequence and deduced amino acid sequence corresponding
to the gene associated with FMF;
Figure 3 is a schematic representation of MEFV on chromosome 16p13.3;
Figure 4 shows the expression profile of v75-1;
Figure 5 shows the DNA sequences of the M680I, M694V and V726A mutants; and
Figure 6 shows the alignment of multiple protein sequences with the C-terminal end
of human pyrin.

Detailed Description of the Invention 

The invention relates to the nucleic acid sequence encoding a protein
associated with familial Mediterranean fever (FMF). The genomic DNA sequence is
designated MEFV. The corresponding cDNA sequence is designated v75-1. The
encoded protein is called pyrin, to connote its relationship to fever. The inventors
have also discovered mutations in MEFV which are associated with FMF.

It is believed that pyrin is a nuclear factor that controls the inflammatory 
response in differentiated polymorphonuclear leukocytes (PMNs). In particular, 
pyrin is believed to be a negative autoregulatory molecule in PMNs. Knowledge of 
the genetic basis of FMF enables the production of diagnostic assays for FMF and 
treatments for FMF and other inflammatory diseases which are characterized by 
accumulation of PMNs, for example, acute infectious diseases such as those caused
by bacterial infection (e.g., pneumococcal pneumonia), autoimmune diseases such
as Sweet's syndrome or Behcet's disease, chronic arthritis, and the like.

The Nucleic Acid Sequence (MEFV) 

The inventors have discovered the nucleic acid sequence for the gene
associated with FMF. The nucleic acid sequence is found on chromosome 16p.
Specifically, MEFV is located at 16p13.3 between the polycystic kidney disease gene
(PKD1) and the tuberous sclerosis gene (TSC2) on the telomeric end, and the
CREB-binding protein gene (CREBBP) on the centromeric end (see Figure 3).

The genomic DNA sequence encoding pyrin (MEFV) [SEQ ID NO: 1] is
shown in Figure 1. The start methionine and stop codon are boxed, while the exons
are underlined. The cDNA sequence (v75-1) [SEQ ID NO: 2] is shown in Figure 2.
In Figure 2, the initial methionine and Kozak consensus sequences are underlined.
The first boxed segment is a bZIP transcription factor basic domain. The second
boxed segment is a Robbins/Dingwall consensus nuclear targeting signal. The
segment indicated by +'s is a potential B-box zinc finger domain. The double-boxed
region encloses a sequence which encodes a rfp, or B30.2, domain [SEQ ID
NO: 4]. Within the double-boxed region (the rfp or B30.2 domain), the nucleic
acids encoding three FMF-associated mutations are double-underlined. Sites of
synonymous single nucleotide polymorphisms are represented by the cents symbol
"¢" above the sequence.

Although there is an excellent Kozak consensus sequence (Kozak,
"Interpreting cDNA sequences: some insights from studies on translation," Mamm.
Genome, 7:563-574 (1996)) at the initial methionine (accATGG), the reading frame
remains open in the cDNA upstream. Because there are no splice-acceptor
consensus sequences or in-frame methionines with good Kozak sequences before the
first stop upstream in the genomic DNA, the initial methionine remains the most
likely starting methionine.
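
The assessment of the start-codon context can be illustrated with a minimal sketch. It assumes the commonly cited simplification that a strong Kozak context has a purine at position -3 and a G at position +4 relative to the A of the ATG. The function name and the three-level classification are illustrative assumptions and are not part of the disclosure.

```python
def kozak_strength(context: str) -> str:
    """Classify the sequence context around a candidate start codon.

    `context` must hold three bases upstream, the ATG, and one base
    downstream (7 bases in total), e.g. "accATGG" as quoted in the text.
    Simplified rule (an assumption): purine at -3 and G at +4 is "strong".
    """
    seq = context.upper()
    if len(seq) != 7 or seq[3:6] != "ATG":
        raise ValueError("expected 3 bases + ATG + 1 base")
    purine_at_minus3 = seq[0] in "AG"  # position -3
    g_at_plus4 = seq[6] == "G"         # position +4
    if purine_at_minus3 and g_at_plus4:
        return "strong"
    if purine_at_minus3 or g_at_plus4:
        return "adequate"
    return "weak"


print(kozak_strength("accATGG"))  # context quoted in the text
```

Under this simplified rule the quoted context is classified as strong, which agrees with the description of the site as an excellent Kozak consensus.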

The RNA Transcript 

The estimated transcript size from the nucleic acid sequence shown in Figure
2 is about 3503 nucleotides. The transcript size determined by Northern blotting is
3.7 kb (see Example 4). The fact that the transcript size estimated from the
sequence shown in Figure 2 approximates the size of the transcript found in
experimental procedures further indicates that the sequence shown in Figure 2 is the
full-length cDNA sequence.

The Encoded Protein

The inventors have also discovered the amino acid sequence for the protein
associated with FMF (pyrin). Pyrin is predicted to be 781 amino acids in length and
very positively charged. The pI is predicted to be greater than 8 (pI > 8), in part
because lysine and arginine residues make up 13% of the amino acid
composition.
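
The composition figure above can be checked for any candidate sequence with a short calculation. The sketch below is illustrative only: the function name is an assumption, and the toy peptide is hypothetical and is not taken from SEQ ID NO: 3.

```python
def basic_residue_fraction(protein: str) -> float:
    """Return the fraction of residues that are lysine (K) or arginine (R)."""
    seq = protein.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("K") + seq.count("R")) / len(seq)


# Hypothetical 16-residue peptide used only to exercise the function.
toy_peptide = "MAKTPSDHLLRKSTLE"
print(basic_residue_fraction(toy_peptide))

# For a 781-residue protein, 13% corresponds to roughly this many K + R residues.
print(round(0.13 * 781))
```

Applied to the full sequence of Figure 2, the same function would be expected to return a value near 0.13 if the stated composition holds.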

The predicted amino acid sequence for pyrin [SEQ ID NO: 3] is shown in
Figure 2. The boxed segment from amino acid 266 to 280 is a bZIP transcription
factor basic domain. The boxed segment from amino acid 420 to 437 is a
Robbins/Dingwall consensus nuclear targeting signal. The segment indicated by +'s
between residues 375 and 407 is a potential B-box zinc finger domain. The region
double-boxed from residue 577 to 757 is a rfp, or B30.2, domain [SEQ ID NO: 5].
The rfp (B30.2) domain is conserved (sequence identity 40 - 60%) in molecules as
diverse as butyrophilin (a milk protein with probable receptor function; Jack and
Mather, "Cloning and molecular analysis of cDNA encoding bovine butyrophilin, an
apical glycoprotein expressed in mammary tissue and secreted in association with
the milk-fat globule membrane during lactation," J. Biol. Chem., 265:14481-14486
(1990)), A33 (a factor that binds polytene chromosomes in the newt; Bellini et al.,
"A putative zinc-binding protein on lampbrush chromosome loops," EMBO J.,
12:107-114 (1993)), and xnf7 (a factor that binds mitotic chromosomes in the frog;
Reddy et al., "The cloning and characterization of a maternally expressed novel zinc
finger nuclear phosphoprotein (xnf7) in Xenopus laevis," Dev. Biol., 148:107-116
(1991)) and, by an analysis with the SEG algorithm (Wootton, "Non-globular
domains in protein sequences: automated segmentation using complexity measures,"
Comput. Chem., 18:269-285 (1994)), most likely assumes a globular conformation.
Within the double-boxed region (the rfp or B30.2 domain), three of the amino acids
that have been found mutated in FMF patients are double-underlined.

Positions of secondary structural elements were predicted by the profile
neural network method PHDsec (Rost and Sander, "Prediction of protein secondary
structure at better than 70% accuracy," J. Mol. Biol., 232:584-599 (1993); Rost and
Sander, "Combining evolutionary information and neural networks to predict protein
secondary structure," Proteins, 19:55-72 (1994)). The secondary structural elements
in wild type pyrin (all β-sheets) are shown as bold, horizontal arrows in Figure 6.

Expression 

Pyrin is predominantly expressed in mature granulocytes and/or serosal cells.
As shown in the Northern blots in Figure 4, high levels of pyrin are expressed in
peripheral blood leukocytes (granulocytes), but not in lymph nodes, bone marrow,
monocytes, lymphocytes, spleen or thymus. Because granulocytes
accumulate in tissues experiencing inflammation during an FMF episode, expression
of pyrin in granulocytes is consistent with the clinical phenotype for FMF.

The restriction of pyrin to granulocytes, its apparent localization in the
nucleus, and the phenotype associated with mutations tend to indicate that pyrin is a
nuclear factor that controls the inflammatory response in differentiated PMNs.
Additionally, the inventors found that pyrin shares homology with a number of
molecules implicated in inflammation, such as rpt-1 (a known downregulator of
inflammation). In view of the fact that FMF is a disease of excessive inflammation,
and that pyrin shares homology with a known downregulator of inflammation, pyrin is
believed to be a negative autoregulatory molecule in PMNs.

Homologies

Pyrin shares homology with a number of molecules implicated in
inflammation including 52 kd Ro/SS-A ribonucleoprotein (patients with systemic
lupus erythematosus (SLE) and Sjögren's syndrome frequently make autoantibodies
against this ribonucleoprotein); Staf-50 (an interferon-inducible transcriptional
regulator; Tissot and Mechti, "Molecular cloning of a new interferon-induced factor
that represses human immunodeficiency virus type 1 long terminal repeat
expression," J. Biol. Chem., 270:14891-14898 (1995)); and rpt-1 (a mouse
downregulator of IL-2; Patarca et al., "rpt-1, an intracellular protein from
helper/inducer T cells that regulates gene expression of interleukin 2 receptor and
human immunodeficiency virus type 1," Proc. Natl. Acad. Sci. USA, 85:2733-2737
(1988)).

The homology between pyrin and rpt-1 is found in a domain extending from
residues 385 - 550 on pyrin. Pyrin shows particularly high homology to many
proteins, including 52 kd Ro/SS-A and Staf-50, at the C-terminal end, the rfp (B30.2)
domain. Figure 6 shows the alignment of the C-terminal end of human pyrin with
multiple sequences having statistical similarity as assessed by BLAST (Altschul et
al., supra). Search cutoffs used to identify homologs were a Karlin-Altschul score
of two aligned sequences ≥ 70 with a probability ≤ 10^[…]. At each position, residues
occurring in a majority of the sequences are shown in inverse type. The numbering
scheme at the top of the figure is based on the sequence of pyrin.
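
The majority rule used to highlight residues in Figure 6 can be sketched as follows. This is a minimal illustration: the function name, the "." placeholder for positions without a majority, and the toy alignment rows are assumptions and do not reproduce the sequences of Figure 6.

```python
from collections import Counter


def majority_residues(alignment: list[str]) -> str:
    """Return, per column, the residue found in more than half of the rows.

    Columns without a strict majority, or where the majority symbol is a
    gap ("-"), are reported as ".".
    """
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned rows must have the same length")
    n_rows = len(alignment)
    consensus = []
    for column in zip(*alignment):
        residue, count = Counter(column).most_common(1)[0]
        is_majority = count > n_rows / 2 and residue != "-"
        consensus.append(residue if is_majority else ".")
    return "".join(consensus)


# Hypothetical four-row alignment, used only to exercise the function.
toy_alignment = ["MKVLA", "MKILA", "MRVLG", "MK-LA"]
print(majority_residues(toy_alignment))
```

In the toy alignment the third column has no residue present in more than half of the rows, so it is left unhighlighted.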

The B-box zinc finger and rfp (B30.2) domain combination observed in pyrin
is also seen in 52 kd Ro/SS-A and ret finger protein. The spacing between the B-box
zinc finger and the rfp (B30.2) domain is highly conserved, suggesting that precise
orientation of the two domains with respect to one another may be required for
function.

Mutants 

The inventors have also discovered missense mutations that are found in
individuals affected with FMF, but not found in any of a large panel of normal
control chromosomes. The missense mutations are clustered within about 40 to 50
amino acids (including residues 680 through 726) in the highly conserved rfp
(B30.2) globular domain. It is believed that the mutations affect the secondary
structure of this domain and result in a structural change that prevents the normal
pyrin-mediated negative feedback loop.

A first mutation associated with FMF is a G → C transversion at nucleotide
2040 which results in the substitution of isoleucine for methionine (M680I). A
second mutation is an A → G transition at nucleotide 2080 which results in the
substitution of valine for methionine (M694V). A third mutation is a T → C
transition at nucleotide 2177 which results in the substitution of alanine for valine
(V726A). Additionally, the inventors have discovered a fourth mutation at position
695 which results in the substitution of arginine for lysine (K695R).
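
The relationship between the nucleotide positions and the amino acid substitutions listed above can be checked with a short sketch. It assumes that nucleotide numbering starts at the A of the initiating ATG, so that codon n spans nucleotides 3n-2 to 3n. The function names are illustrative, and the wild type valine codon GTT used for V726A is an assumption (any GTN codon gives the same result); the source text states only the T → C change.

```python
# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_position(nucleotide: int) -> tuple[int, int]:
    """Map a 1-based coding nucleotide to (codon number, position in codon)."""
    index = nucleotide - 1
    return index // 3 + 1, index % 3 + 1


def missense(nucleotide: int, wild_codon: str, new_base: str) -> str:
    """Name the substitution caused by a single-base change, e.g. 'M680I'."""
    codon_number, position = codon_position(nucleotide)
    mutant_codon = wild_codon[:position - 1] + new_base + wild_codon[position:]
    return f"{CODON_TABLE[wild_codon]}{codon_number}{CODON_TABLE[mutant_codon]}"


print(missense(2040, "ATG", "C"))  # G -> C at the third codon position
print(missense(2080, "ATG", "G"))  # A -> G at the first codon position
print(missense(2177, "GTT", "C"))  # T -> C at the second position; GTT assumed
```

Under the stated numbering assumption, the three nucleotide positions fall in codons 680, 694 and 726 respectively, consistent with the substitutions named in the text.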

It is believed that phenotypic variation in FMF may be attributable to the
differences between mutations. For example, the M694V mutation is very common
in populations with the highest incidence of systemic amyloidosis (especially North
African Jews). On the other hand, V726A is seen in populations in which amyloid is
less common (Iraqi and Ashkenazi Jews, Druze and Armenians).

Figure 5 shows DNA sequence electropherograms, produced by amplifying
exon 10 genomic DNA and sequencing, which demonstrate the M680I, M694V, and
V726A substitutions. For each mutation, individuals who are homozygous for the
normal allele are shown at the top, heterozygotes between the normal and mutant
allele are shown in the middle, and homozygotes for the mutation are shown at the
bottom.

None of these mutations results in a truncated protein. This is consistent with
the periodic nature of the inflammatory attacks in FMF. Other diseases with
periodic episodes are associated with a protein that functions adequately at steady
state, but decompensates under stress, such as sickle cell anemia (Weatherall et al.,
"The hemoglobinopathies," in The Metabolic and Molecular Bases of Inherited
Disease, Scriver et al., eds., New York, McGraw-Hill, pp. 3417-3484 (1995)) and
hyperkalemic periodic paralysis (Ptacek et al., "Identification of a mutation in the
gene causing hyperkalemic periodic paralysis," Cell, 67:1021-1027 (1991)).

Diagnostic Methods

The sequences provided by this invention can be used in methods for
diagnosis of risk for developing FMF. As used herein, an individual is "at risk" for
developing FMF when the individual has a mutant MEFV nucleic acid sequence
which results in expression of mutant pyrin, particularly where the amino acid
mutation occurs in the highly conserved rfp (B30.2) C-terminal domain. Mutations
include substitutions of one nucleotide with a different nucleotide. In contrast, a
patient having wild type MEFV nucleic acid sequence expressing wild type pyrin is
not at risk for developing FMF. As used herein, "wild type" refers to a dominant
genotype which naturally occurs in the normal population (i.e., members of the
population not afflicted with familial Mediterranean fever). Thus, methods for
identifying an individual's specific nucleic acid or amino acid sequence are useful
for determining risk of FMF. Specifically, a method for determining whether an
individual's nucleic acid sequence encodes a wild type or mutant pyrin is useful in
determining whether the individual is at risk for developing FMF.

Many methods for analysis of an individual's nucleic acid or amino acid
sequences are known to those of skill in the art, and include, for example, direct
sequencing, ARMS (amplification refractory mutation system), restriction
endonuclease assays, oligonucleotide hybridization techniques, and immunoassays.
While some commonly used procedures are exemplified below, the inventors are
aware that other methods are available and include them within the scope of their
invention.

Southern Blot Techniques 

In Southern blot analysis, DNA is obtained from an individual and then
separated by gel electrophoresis. Following electrophoresis, the double-stranded
DNA is converted to single-stranded DNA, for example, by soaking the gel in
NaOH. The DNA is then transferred to a sheet of nitrocellulose and
contacted with a labeled probe. For example, labeled probe can be applied to the
nitrocellulose after it dries. As used herein, a "probe" is a nucleic acid sequence that
is complementary to the sequence of interest. The probe can be either a DNA
sequence or an RNA sequence. Preferably the probe is about 8 to 16 nucleotides in
length. A radioactive label, such as 32P, is an example of a suitable label. Other
suitable labels include fluorophores or an enzyme which catalyzes a color-producing
reaction (e.g., horseradish peroxidase). Because the probe has a sequence
complementary to the DNA sequence of interest, it will hybridize to the specific DNA
sequence. As used herein, "hybridize" means that the probe will form a
double-stranded molecule with the specific DNA sequence by complementary base pairing
under conditions of high stringency (e.g., 65°C; 0.1× SSC; Sambrook et al.,
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, New York: Cold
Spring Harbor Press (1989)). After the probe is allowed to hybridize to the DNA,
excess probe is washed away. The hybridized DNA is easily visualized from the
labeled probe using known techniques. Hybridization of the probe indicates that the
sample DNA contains a sequence that is complementary to the labeled probe. In a
preferred method, hybridization probes are designed from the MEFV nucleic acid
sequences, and particularly, from the C-terminal MEFV sequence encoding the rfp
(B30.2) globular domain.

It is often desirable to amplify the sample DNA for more efficient analysis.
Polymerase chain reaction (PCR) can be used to amplify the DNA. PCR is a
technique that is well known to one of skill in the art. An exemplary method
includes developing oligonucleotide primers that hybridize to opposite strands of
DNA flanking the MEFV gene. As used herein, a "primer" is a short nucleotide
sequence which is complementary to a DNA sequence flanking the DNA sequence
of interest. Preferably the primer is about 15 to 20 nucleotides in length. The
specific fragment defined by the primers exponentially accumulates by repeated
cycles of denaturation, oligonucleotide primer annealing and primer extension. In a
preferred embodiment, the PCR primers amplify the region encoding the rfp (B30.2)
globular domain. The amplified domain can then be analyzed by hybridization or
screening techniques.

For example, oligonucleotide primers are developed to amplify MEFV, the
region encoding the rfp (B30.2) domain, or a fragment thereof, such as the region
encoding the preferred 40 to 50 amino acid fragment of the rfp (B30.2) domain
discussed above. Suitable oligonucleotide primers, such as "Exon 10A Forward and
Reverse", "Exon 10B Forward and Reverse", and "Exon 10B Forward and Exon 10A
Reverse", are shown in Example 1.



Northern Blot Techniques 
The presence of a wild type or mutant RNA transcript may be determined by
Northern blot techniques, following a procedure similar to that outlined for the
Southern blot technique.



Western Blot Techniques 

The presence of a wild type or mutant protein from the highly conserved
C-terminal rfp (B30.2) region can be detected by immunoassay, for example by
Western blot techniques. In this procedure, a tissue sample is obtained from an
individual and separated by gel electrophoresis. Following electrophoresis, the
proteins are then transferred to nitrocellulose. The proteins are then contacted with a
labeled probe, for example, by applying the labeled probe to the nitrocellulose after
it is dried. Suitable probes include labeled anti-pyrin antibodies, preferably those
antibodies specific for an epitope in the highly conserved C-terminal rfp (B30.2)
domain. Exemplary labels include radioactive isotopes, enzymes, fluorophores and
chromophores. Because it is believed that mutations in the highly conserved
C-terminal domain alter the secondary structure of the domain, an antibody specific for
the wild-type protein should not bind to or recognize a protein having a mutation in
this highly conserved region. Conversely, an antibody specific for a mutant protein
does not recognize or bind to the wild type. After excess antibody is rinsed away,
the presence of the specific protein/antibody complex is easily determined by known
methods, for example by development of the label attached to the anti-pyrin
antibody, or by the use of secondary antibodies.



Sequencing Techniques 

Alternatively, DNA, RNA or protein obtained from an individual can be
sequenced by known methods, and compared to the wild type sequence. Mutations
recognized in the sequence, particularly in the rfp (B30.2) domain, indicate risk for
developing FMF.

ARMS

ARMS (amplification refractory mutation system) is a PCR-based technique
in which an oligonucleotide primer that is complementary to either a normal allele or
mutant allele is used to amplify a DNA sample. In one variation of this method, a
pair of primers is used in which one primer is complementary to a known mutant
sequence. If the DNA sample is amplified, the presence of the mutant sequence is
confirmed. Lack of amplification indicates that the mutant sequence is not present.
In a different variation, the primers are complementary to wild type sequences.
Amplification of the DNA sample indicates that the DNA has the wild type
sequence complementary to the primers. If no amplification occurs, the DNA likely
contains a mutation at the sequence where hybridization should have occurred. A
description of ARMS can be found in Current Protocols in Human Genetics, Chapter
9.8, John Wiley & Sons, ed. by Dracopoli et al. (1995).
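
The two ARMS variations described above can be combined into a single genotype call when both reactions are run on the same sample. The sketch below is illustrative only: running both reactions in parallel and the wording of the returned labels are assumptions, not a protocol taken from the disclosure.

```python
def arms_genotype(wild_type_amplified: bool, mutant_amplified: bool) -> str:
    """Interpret a pair of allele-specific amplification results.

    `wild_type_amplified` is the outcome of the reaction primed with the
    wild type specific primer; `mutant_amplified` is the outcome of the
    reaction primed with the mutant specific primer.
    """
    if wild_type_amplified and mutant_amplified:
        return "heterozygous"
    if mutant_amplified:
        return "homozygous mutant"
    if wild_type_amplified:
        return "homozygous wild type"
    # Neither primer extended: the reaction failed, or the sample carries
    # a different change at the primer site.
    return "uninformative"


for result in [(True, False), (True, True), (False, True), (False, False)]:
    print(result, arms_genotype(*result))
```

Treating the double-negative case as uninformative rather than as evidence of a mutation is a conservative design choice, since failed amplification has many causes.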

Restriction Endonuclease Assays 

Restriction endonuclease assays can also be used to screen a DNA sample for
mutants. Such assays are used by Pras et al., "Mutations in the SLC3A1 transporter
gene in cystinuria," Am. J. Hum. Genet., 56:1297-1303 (1995). Briefly, a DNA
sample is amplified and then exposed to restriction endonucleases that will or will
not cleave the DNA depending on whether or not a mutation is present. After
cleavage, the sizes of the restriction fragments are observed to determine whether or not
cleavage occurred.
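
The fragment-size readout of a restriction endonuclease assay can be illustrated with a small simulation. The amplicon sequences are hypothetical, and EcoRI (recognition site GAATTC, cutting after the first G) is used only as a familiar example; the disclosure does not name an enzyme for the MEFV mutations. The function name and the `cut_offset` parameter are assumptions.

```python
def digest(amplicon: str, site: str, cut_offset: int) -> list[int]:
    """Return the fragment lengths after cutting at every `site` occurrence.

    `cut_offset` is the number of bases of the recognition site that remain
    on the left-hand fragment (1 for EcoRI, which cuts G^AATTC).
    """
    seq = amplicon.upper()
    site = site.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    edges = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(edges, edges[1:])]


# Hypothetical 26-base amplicons: the mutant differs by one base in the site.
wild_type = "ACGTACGTAC" + "GAATTC" + "TTGACCATGG"
mutant = "ACGTACGTAC" + "GAGTTC" + "TTGACCATGG"

print(digest(wild_type, "GAATTC", 1))  # site present: two fragments
print(digest(mutant, "GAATTC", 1))     # site destroyed: one uncut fragment
```

In this toy case the single-base change abolishes the recognition site, so the mutant amplicon remains uncut while the wild type yields two fragments.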

Oligonucleotide Hybridization Techniques

Hybridization techniques, such as dot blots, are known to one of skill in the
art and can be used to determine whether a DNA sample contains a specific
sequence. In a dot blot, a DNA sample is denatured and exposed to a labeled probe
which is complementary to a wild type sequence or a mutant sequence.
Hybridization of a probe that is complementary to the wild type sequence (a "wild
type probe") indicates that the wild type sequence is present. If the wild type probe
does not hybridize to the DNA in the sample, the wild type sequence is not present.
In a variation of this technique, a probe that is complementary to a known mutant
sequence can be used. A discussion of allele-specific oligonucleotide testing can be
found in Current Protocols in Human Genetics, Chapter 9.4, supra.
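
Allele-specific probe hybridization can be modeled, in its simplest form, as an exact complementarity test. In the sketch below the flanking bases around the A → G change are hypothetical, the function names are assumptions, and perfect-match binding stands in for high-stringency conditions; real hybridization also depends on temperature and salt concentration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def hybridizes(sample_strand: str, probe: str) -> bool:
    """True if the probe is perfectly complementary to part of the strand."""
    return reverse_complement(probe) in sample_strand.upper()


# Hypothetical 14-base targets differing by a single A -> G change.
wild_type_target = "GGAGAATGGCTACT"
mutant_target = "GGAGAGTGGCTACT"
wild_type_probe = reverse_complement(wild_type_target)
mutant_probe = reverse_complement(mutant_target)

sample = "TTCA" + wild_type_target + "GGTC"  # denatured wild type sample strand
print(hybridizes(sample, wild_type_probe))
print(hybridizes(sample, mutant_probe))
```

A sample from a heterozygote would be expected to give a signal with both probes, since both target sequences are present.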



Immunological Assays

An immunological assay, such as an enzyme-linked immunosorbent assay (ELISA),
can be used as a diagnostic tool to determine whether or not an individual is at risk
for developing FMF. One of skill in the art is familiar with the procedure for
performing an ELISA. Briefly, antibodies are generated against native or mutant
pyrin. This can be accomplished by administering a native or mutant protein to an
animal, such as a rabbit. The anti-pyrin antibodies are purified and screened to
determine specificity. In one representative example of an immunoassay, wells of a
microtiter plate are coated with the specific anti-pyrin antibodies. An aliquot of a
sample from a patient to be analyzed for pyrin is added in serial dilution to each
antibody-coated well. The sample is then contacted with labeled anti-pyrin
antibodies. For example, labeled anti-pyrin antibodies, such as biotinylated
anti-pyrin antibodies, can be added to the microtiter plate as secondary antibodies.
Detection of the label is correlated with the specific pyrin antigen assayed. Other
examples of suitable secondary antibody labels include radioactive isotopes,
enzymes, fluorophores or chromophores. The presence of bound labeled
(biotinylated) antibody is determined by the interaction of the biotin with avidin
coupled to peroxidase. The activity of the bound peroxidase is easily determined by
known methods.

Production of Pyrin

The nucleic acid sequence encoding wild type or mutant pyrin can be used to
produce pyrin in cells transformed with the sequence. For example, cells can be
transformed by known techniques with an expression vector containing the v75-1 cDNA
sequence operably linked to a functional promoter. Expression of pyrin in
transformed cells is useful in vitro to produce large amounts of the protein.
Expression in vivo is useful to provide the protein to pyrin-deficient cells. Examples
of suitable host cells include microbial cells such as bacterial or yeast cells, for
example, E. coli. Additionally, mammalian cells, such as Chinese hamster ovary
(CHO) cells, can be used. Human cells, such as SW480 colorectal adenocarcinoma
cells, can also be used as host cells.

Due to degeneracy of the genetic code, most amino acids are encoded by
more than one codon. Therefore, applicants recognize, and include within the scope
of the invention, variations of the sequence shown in SEQ ID NO: 1. For example,
codons in a DNA sequence encoding pyrin can be modified to reflect the optimal
codon frequencies observed in a specific host. Rare codons having a frequency of
less than about 20% in known sequences of the desired host are preferably replaced
with higher frequency codons.
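
The replacement of rare codons described above can be sketched as follows. The codon usage values are invented for illustration and do not describe any real host. The function name, the table layout, and the choice to substitute the single most frequent synonymous codon are assumptions.

```python
def replace_rare_codons(cds: str, usage: dict[str, dict[str, float]],
                        threshold: float = 0.20) -> str:
    """Swap codons used below `threshold` for the host's most frequent synonym.

    `usage` maps an amino acid to {codon: relative frequency in the host}.
    Codons absent from the table are left unchanged.
    """
    amino_acid_of = {codon: aa for aa, codons in usage.items() for codon in codons}
    optimized = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3].upper()
        aa = amino_acid_of.get(codon)
        if aa is not None and usage[aa][codon] < threshold:
            codon = max(usage[aa], key=usage[aa].get)
        optimized.append(codon)
    return "".join(optimized)


# Hypothetical usage table for two amino acids (values are made up).
toy_usage = {
    "R": {"CGT": 0.38, "CGC": 0.34, "CGG": 0.11,
          "CGA": 0.07, "AGA": 0.07, "AGG": 0.03},
    "L": {"CTG": 0.47, "TTA": 0.14, "TTG": 0.13,
          "CTT": 0.12, "CTC": 0.10, "CTA": 0.04},
}

print(replace_rare_codons("ATGAGACTAAAA", toy_usage))
```

Because only synonymous codons are substituted, the encoded amino acid sequence is unchanged by this step.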

Additional sequence modifications are known to enhance protein expression
in a cellular host. These include elimination of sequences including spurious
polyadenylation signals, exon/intron splice site signals, transposon-like repeats, and
other well characterized sequences which may be deleterious to gene expression.
The G-C content of a sequence may be adjusted to levels average for a given cellular
host, as calculated by reference to known genes expressed in the host cell. Where
possible, the sequence is modified to avoid predicted hairpin secondary mRNA
structures. The genomic sequence might additionally be modified by the removal of
introns.
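
Comparing the G-C content of a construct with the host average is a simple calculation. In the sketch below the function names, the example sequence, the host average of 0.50 and the tolerance of 0.05 are all hypothetical values chosen for illustration.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of bases in `seq` that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def needs_gc_adjustment(seq: str, host_average: float,
                        tolerance: float = 0.05) -> bool:
    """True if the sequence deviates from the host average by more than tolerance."""
    return abs(gc_content(seq) - host_average) > tolerance


toy_sequence = "ATGCGCATTA"          # hypothetical, 4 of 10 bases are G or C
print(gc_content(toy_sequence))
print(needs_gc_adjustment(toy_sequence, host_average=0.50))
```

In practice the host average would be calculated from known genes expressed in the host cell, as stated above.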

Transgenic Animals 

The nucleic acid sequences encoding pyrin, both wild-type and mutant,
provided in this application are useful for the development of transgenic animals
expressing pyrin. Such transgenic animals are used, for example, to screen
compounds for treating FMF or inflammation.

Useful variations of a transgenic animal are "knock out" or "knock in"
animals. In a "knock out" animal, a known gene sequence, such as the sequence
encoding pyrin, is deleted from the animal's genome. Experiments can be
performed on the animal to determine what effect the absence of the gene has on the
animal. In a "knock in" experiment, the wild type gene is deleted and a mutant
version or a gene from another organism is inserted in its place. Experiments can be
performed on the animal to determine the effects of this substitution.

Kits

The invention is also directed towards a kit for diagnosing risk of FMF. A
suitable diagnostic kit includes a nucleic acid sequence encoding wild-type pyrin
and at least one nucleic acid sequence encoding mutant pyrin. An alternative kit
includes an anti-pyrin antibody which binds to wild-type pyrin and at least one
anti-pyrin antibody which binds to mutant pyrin. A kit also preferably contains at least
one pair of amplification primers capable of amplifying a nucleic acid sequence
encoding pyrin. Preferably, the primers amplify a nucleic acid sequence encoding a
rfp (B30.2) domain of pyrin.

The present invention may be better understood with reference to the 
following examples. These examples are intended to be representative of specific 
embodiments of the invention, and are not intended as limiting the scope of the 
invention. 


Examples 

The DNA samples used in the following examples were extracted from
whole blood or from Epstein-Barr virus-transformed lymphocytes by standard
techniques. The DNA was obtained from forty-four families of non-Ashkenazi
Jewish descent (18 Moroccan, 14 Libyan, 5 Tunisian, 2 Egyptian and 5 Iraqi) and 5
Arab/Druze families (identified and sampled at the Chaim Sheba Medical Center in
Tel-Hashomer, Israel). Additionally, twelve Armenian families were recruited from
Cedars-Sinai Medical Center in Los Angeles. One Ashkenazi/Iraqi Jewish family
was also studied.

The diagnosis of FMF in all families was according to established clinical
criteria (Sohar et al., "Familial Mediterranean fever: a survey of 470 cases and
review of the literature," Am. J. Med., 43:227-253 (1967)).

Example 1. Positional Cloning

A positional cloning approach was used to clone a new cDNA (v75-1) from
the FMF candidate region on chromosome 16p13.3. Mutational analysis indicates
that v75-1 is the gene (designated MEFV) expressing pyrin, mutations of which are
associated with FMF disorder.

Publicly available polymorphic markers (discussed below) were used to
narrow the candidate region on chromosome 16p to an approximately 1 Mb interval
between D16S94 and D16S2622 (Sood et al., "Construction of a 1-Mb restriction
mapped cosmid contig containing the candidate region for the familial
Mediterranean fever locus (MEFV) on chromosome 16p13.3," Genomics, 42:83-95
(1997)) lying between the polycystic kidney disease (PKD1) and tuberous sclerosis
(TSC2) genes on the telomeric end, and the CREB-binding protein (CREBBP) gene
on the centromeric end (see Figure 3). Because physical maps constructed around
these genes did not extend into the MEFV region, a contig was constructed which
spanned the candidate region.

Attempts to construct a mega YAC (yeast artificial chromosome) contig
spanning the MEFV candidate region were unsuccessful due to the instability of
YAC clones from this region of chromosome 16. Instead, a cosmid map was
assembled by iterative screening of a flow-sorted chromosome 16-specific cosmid
library. D16S246 was the telomeric starting point of the chromosomal walk.
Identification of recombinants at D16S2622 enabled us to use this microsatellite
marker as the centromeric boundary (Sood et al., 1997, supra).

Observed recombinations of microsatellite markers in a panel of 61 families
defined a critical region of 285 kb (D16S468 - D16S3376).

By analysis of the genomic sequence from this region, two new
microsatellites, D16S3404 and D16S3405 (Figure 3B), were found in the center of
the D16S3082 - D16S3373 interval. In one non-Ashkenazi Jewish family, evidence
of a historical recombination event between D16S3404 and D16S3405 in the highly
conserved non-Ashkenazi Jewish haplotype (designated haplotype A) was observed.
Therefore, the region telomeric of D16S3405 (and the 4 candidate genes encoded
therein) was excluded from further consideration. The discovery of the two new
microsatellites and the historical recombination event further refined the
candidate interval to the centromeric-most 115 kb.

A combined strategy of exon amplification, direct cDNA selection, and
single-pass sequencing led to the isolation of 9 full length cDNA clones. The
furthest centromeric cDNA clone, v75-1, was isolated by solution hybridization of a
leukocyte cDNA library with biotinylated oligonucleotide probes derived from two
exons trapped from PAC 273L24.

Exon Trapping

PAC (P1 artificial chromosome) clone 273L24 (Genome Systems; St. Louis)
includes the centromeric-most 115 kb. Therefore, exon trapping was performed on
PAC clone 273L24, essentially as described by
Buckler et al., "Exon amplification: a strategy to isolate mammalian genes based on
RNA splicing," Proc. Natl. Acad. Sci. USA, 88:4005-4009 (1991). Briefly, PAC
clone 273L24 was partially digested with Sau3AI (commercially available, for
example, from New England Biolabs). The reaction products were size fractionated
by agarose gel electrophoresis and DNA fragments 2 kb and larger were isolated
from the gel. Fifty ng of partially digested DNA was ligated with 10 ng of exon
trapping vector pSPL3 (Exon Trapping System; Life Technologies, Gaithersburg,
MD) that had been previously cleaved with BamHI (commercially available) and
dephosphorylated with calf intestinal alkaline phosphatase (Promega, Madison, WI).
Ligation products were electroporated into E. coli DH12B (Life Technologies,
Gaithersburg, MD). The electroporated cells were cultured en masse in LB broth with
200 µg/ml ampicillin for 16 hours at 37° C with shaking.

DNA prepared from the culture was used to transfect COS-7 cells (ATCC 
30-2002) using lipofectACE reagent (Life Technologies, Gaithersburg, MD). Total 
RNA was isolated from transfected COS-7 cells with Trizol reagent (Life 
Technologies) followed by ethanol precipitation. 

First strand cDNAs of transcription products from pSPL3 were primed with
the oligonucleotide SA2 (Exon Trapping System; Life Technologies, Gaithersburg,
MD). Specific amplification of trapped exons was as follows: PCR primed with
oligonucleotides SA2 and SD6 (Exon Trapping System; Life Technologies,
Gaithersburg, MD) was performed, followed by digestion of the PCR products with
BstXI (commercially available).

WO 99/09169 PCT/US98/17255

A second PCR reaction using the digestion products was primed with
oligonucleotides dUSD2 and dUSA4 (Exon Trapping System; Life Technologies,
Gaithersburg, MD). The resulting DNA fragments were cloned into pAMP10 vector
(Exon Trapping System; Life Technologies, Gaithersburg, MD) and sequenced.
Two hundred clones were sequenced and 20 independent exons were identified by
visual inspection and hybridization to DNA fragments from the FMF critical region,
with several exons identified more than once.

Oligonucleotides for Exon Amplification 

Oligonucleotides used to amplify pyrin exons were as follows (all oligo
sequences are given 5' to 3'):
Exon 1 forward, AAC CTG CCT TTT CTT GCT CA; [SEQ ID NO: 6]
Exon 1 reverse, CAC TCA GCA CTG GAT GAG GA; [SEQ ID NO: 7]
Exon 2A forward, ATC ATT TTG CAT CTG GTT GTC CTT CC; [SEQ ID NO: 8]
Exon 2A reverse, TCC CCT GTA GAA ATG GTG ACC TCA AG; [SEQ ID NO: 9]
Exon 2B forward, GGC CGG GAG GGG GCT GTC GAG GAA GC; [SEQ ID NO: 10]
Exon 2B reverse, TCG TGC CCG GCC AGC CAT TCT TTC TC; [SEQ ID NO: 11]
Exon 3 forward, TGA GAA CTC GCA CAT CTC AGG C; [SEQ ID NO: 12]
Exon 3 reverse, AAG GCC CAG TGT GTC CAA GTG C; [SEQ ID NO: 13]
Exon 4 forward, TTG GCA CCA GCT AAA GAT GGC; [SEQ ID NO: 14]
Exon 4 reverse, TCT CCC TCT ACA GGG ATG AGC; [SEQ ID NO: 15]
Exon 5 forward, TAT CGC CTC CTG CTC TGG AAT C; [SEQ ID NO: 16]
Exon 5 reverse, CAC TGT GGG TCA CCA AGA CCA AG; [SEQ ID NO: 17]
Exon 6 forward, TCC AGG AGC CCA GAA GTA GAG; [SEQ ID NO: 18]
Exon 6 reverse, TTC TCC CTA TCA AAT CCA GAG; [SEQ ID NO: 19]
Exon 7 forward, AGA ATG TAG TTC ATT TCC AGC; [SEQ ID NO: 20]
Exon 7 reverse, CAT TTC TGA ACG CAG GGT TT; [SEQ ID NO: 21]
Exon 8/9 forward, ACC TAA CTC CAG CTT CTC TCT GC; [SEQ ID NO: 22]
Exon 8/9 reverse, AGT TCT TCT GGA ACG TGG TAG; [SEQ ID NO: 23]
Exon 10A forward, CCA GAA GAA CTA CCC TGT CCC; [SEQ ID NO: 24]
Exon 10A reverse, AGA GCA GCT GGC GAA TGT AT; [SEQ ID NO: 25]
Exon 10B forward, GAG GTG GAG GTT GGA GAC AA; [SEQ ID NO: 26]
Exon 10B reverse, TCC TCC TCT GAA ATC CAT GG. [SEQ ID NO: 27]
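
For readers who wish to sanity-check primers such as those listed above, the following Python sketch computes length, G+C fraction, and a rough melting temperature. The Wallace rule (2 °C per A/T, 4 °C per G/C) is used here as an assumption; it is only a rule of thumb for short oligonucleotides and is not stated in this specification. The helper names are hypothetical, and only the two exon 1 primers are shown.

```python
# Two primers copied from the list above; spaces are removed before analysis.
PRIMERS = {
    "Exon 1 forward": "AAC CTG CCT TTT CTT GCT CA",
    "Exon 1 reverse": "CAC TCA GCA CTG GAT GAG GA",
}


def normalize(primer):
    """Strip the triplet spacing and upper-case the sequence."""
    return primer.replace(" ", "").upper()


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rule-of-thumb Tm in degrees C: 2*(A+T) + 4*(G+C)."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, raw in PRIMERS.items():
    seq = normalize(raw)
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.2f}, Tm ~{wallace_tm(seq)} C")
```

Matching the estimated Tm of a forward and reverse primer within a few degrees is a common, though not universal, design heuristic.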



Direct cDNA selection 

Direct cDNA selection was used to isolate 2 full-length cDNA clones
(Parimoo et al., "cDNA selection: efficient PCR approach for the selection of
cDNAs encoded in large chromosomal DNA fragments," Proc. Natl. Acad. Sci.
USA, 88:9623-9627 (1991)). Cosmids, BAC (bacterial artificial chromosome) and
P1 clones in the FMF candidate region were biotinylated using BioPrime (Life
Technologies, Gaithersburg, MD). cDNAs were prepared from combined mRNA
from fetal brain, fetal liver, and human lymph node by reverse transcription and
ligation of an EcoRI/NotI adaptor to second strand cDNAs.

cDNAs were directly hybridized to biotinylated templates which were
recovered using streptavidin-labeled magnetic beads. Conditions for blocking,
hybridization, binding and elution of cDNAs from magnetic beads (Dynal) were as
described by Parimoo et al., supra. After two rounds of selection, eluted cDNAs
were amplified with CUA-tailed EcoRI/NotI adaptor primers and subcloned into the
pAMP10 vector (Life Technologies, Gaithersburg, MD) to yield libraries of selected
cDNAs.

Recombinant clones were arrayed on blots. Clones that hybridized to either
repetitive or ribosomal sequences were excluded from further analysis. To confirm
their origin, unique clones were individually hybridized to EcoRI digests of
cosmid/BAC/P1 DNAs and DNAs from chromosome 16-specific human-hamster
hybrid lines. Clones were then hybridized to each other and were binned into
groups. Representative clones of each group were hybridized to multiple tissue
Northern blots and sequenced.

cDNA Identification by Solution Hybridization

Following the protocol provided in the GeneTrapper kit, the furthest
centromeric cDNA, clone v75-1, was isolated by solution hybridization of a
leukocyte cDNA library with biotinylated oligonucleotide probes derived from 2
exons trapped from PAC 273L24. Solution hybridization was carried out using the
GeneTrapper cDNA Positive Selection System (Life Technologies, Gaithersburg,
MD).

Two trapped exons, v66 and v75, were used as starting material. PCR
screening of Superscript cDNA libraries (Life Technologies, Gaithersburg, MD)
derived from human brain, liver, leukocytes, spleen, and testis was used to
determine the tissue-specific expression of these exons. GeneTrapper experiments
were performed with sense and antisense primers from both exons, assuming both
orientations of these exons in the putative transcript.

The following oligonucleotides were synthesized and PAGE-purified:
v66GT1: AAG CTC ACT GCC TTC TCC TC; [SEQ ID NO: 28]
v66GT2: GAG GAG AAG GCA GTG AGC TT; [SEQ ID NO: 29]
v75GT1: GAC TTG GAA ACA AGT GGG AG; [SEQ ID NO: 30]
v75GT2: CTC CCA CTT GTT TCC AAG TC. [SEQ ID NO: 31]

Oligos were biotinylated and hybridized to single-stranded DNA from the
leukocyte cDNA library (one primer per reaction), followed by cDNA capture using
paramagnetic streptavidin beads and repair using the corresponding non-biotinylated
oligos. Colony hybridization of lifts using 32P-dCTP end-labeled oligos was used
to identify positive clones. Gel-purified inserts from these clones were hybridized to
cosmid contig blots in order to distinguish cDNA clones mapping to the FMF region
from false positive clones due to homologous domains. All positive clones were
identified by the primers v66GT2 and v75GT2, and no clones were identified by the
other set of primers.
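
The GeneTrapper oligonucleotides listed above come in sense/antisense pairs, so each "GT2" oligo should be the reverse complement of the corresponding "GT1" oligo. The short Python sketch below checks this for the four sequences as printed; the helper name `reverse_complement` is illustrative and not part of any kit or protocol.

```python
# Oligonucleotide sequences copied from the list above (5' to 3').
OLIGOS = {
    "v66GT1": "AAG CTC ACT GCC TTC TCC TC",
    "v66GT2": "GAG GAG AAG GCA GTG AGC TT",
    "v75GT1": "GAC TTG GAA ACA AGT GGG AG",
    "v75GT2": "CTC CCA CTT GTT TCC AAG TC",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unspaced DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def clean(raw):
    """Remove the triplet spacing used in the listing."""
    return raw.replace(" ", "").upper()


for exon in ("v66", "v75"):
    sense = clean(OLIGOS[exon + "GT1"])
    antisense = clean(OLIGOS[exon + "GT2"])
    print(exon, "pair is reverse-complementary:",
          reverse_complement(sense) == antisense)
```

Because only one oligo of each pair can anneal to a given single-stranded library, the observation that only the GT2 primers captured clones indicates the orientation of both exons in the transcript.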

Characterization of cDNA v75-1

The translated v75-1 cDNA sequence is shown in Figure 2. The exon-intron
structure deduced from the genomic sequence of two cosmids is depicted in
Figure 3C. Shaded boxes represent exons; introns are drawn to scale. The numbers above
the boxes represent the size of the exons in bp. The numbers below the boxes reflect
the order of the exons, with 1 being the most 5'.

Although there is an excellent Kozak consensus (Kozak, supra) at the initial 
methionine, the reading frame remains open in the cDNA upstream. There are no
splice-acceptor consensus sequences or in-frame methionines with good Kozak 
sequences before the first stop upstream in the genomic DNA. Additionally, the 
transcript size by Northern blot is 3.7 kb. The estimated transcript size from cDNA 
is 3503 nucleotides. Therefore, the sequence appears to be the full-length sequence. 


Example 2. Mutational Analysis

Three different v75-1 mutations found on FMF carrier chromosomes in multiple
ethnic groups were not seen in a panel of almost 300 normal control chromosomes.
This indicates that v75-1 is a cDNA of MEFV, the gene associated with FMF.

Three missense mutations were identified in exon 10 of v75-1 (Figure 5)
after screening a total of 165 individuals from 65 families. All three mutations are
clustered within 46 amino acids of one another in the highly conserved rfp (B30.2)
globular domain at the C-terminal end of the predicted protein. The first mutation is
a G→C transversion at nucleotide 2040 in which methionine is replaced by
isoleucine (M680I). This mutation was observed in the homozygous state in the
affected offspring of a single Armenian family. The second mutation is an A→G
transition at nucleotide 2080 in which methionine is replaced by valine (M694V).
This was observed in a large number of affected individuals bearing four apparently
distinct disease associated haplotypes. The third mutation is a T→C transition at
nucleotide 2177 which substitutes alanine for valine (V726A). It was observed in
affected individuals bearing the C haplotype in a Druze family and in other FMF
patients and carriers bearing this haplotype. An additional mutation in which lysine
is replaced by arginine at position 695 (K695R) was observed in an American FMF
patient of Northern European ancestry.
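
The relationship between the nucleotide positions and the amino acid changes reported above can be checked with a little arithmetic. The Python sketch below assumes that nucleotide numbering starts at the A of the initiator ATG (an assumption that is consistent with the positions given, but is not stated explicitly here). The wild-type valine codon is written as GTT purely as a placeholder, since any GTN codon encodes valine and any GCN codon encodes alanine. All names are hypothetical.

```python
PURINES = {"A", "G"}

# Only the codons needed for this check; a subset of the standard genetic code.
CODON_TABLE = {"ATG": "M", "ATC": "I", "GTG": "V", "GTT": "V", "GCT": "A"}


def codon_number(nt_pos):
    """1-based codon index for a 1-based coding nucleotide position."""
    return (nt_pos - 1) // 3 + 1


def substitution_type(ref, alt):
    """Transition: purine<->purine or pyrimidine<->pyrimidine; else transversion."""
    return "transition" if (ref in PURINES) == (alt in PURINES) else "transversion"


def mutate(codon, nt_pos, ref, alt):
    """Apply a single-base change at the codon offset implied by nt_pos."""
    offset = (nt_pos - 1) % 3
    if codon[offset] != ref:
        raise ValueError("reference base does not match codon")
    return codon[:offset] + alt + codon[offset + 1:]


# (nucleotide position, ref base, alt base, assumed wild-type codon)
MUTATIONS = [(2040, "G", "C", "ATG"), (2080, "A", "G", "ATG"), (2177, "T", "C", "GTT")]

for pos, ref, alt, wt in MUTATIONS:
    mutant = mutate(wt, pos, ref, alt)
    print(f"{CODON_TABLE[wt]}{codon_number(pos)}{CODON_TABLE[mutant]}",
          substitution_type(ref, alt))
```

Under these assumptions the three changes fall in codons 680, 694 and 726, which span 726 - 680 = 46 residues, in agreement with the text.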

Direct sequencing of RT-PCR products or amplified exons from the 8
cDNAs telomeric to v75-1 failed to identify disease-associated mutations.

It is extremely unlikely that the substitutions in v75-1 are actually
polymorphisms in tight linkage disequilibrium with "real" mutations on a nearby
gene. This hypothesis would require that there be 3 such v75-1 polymorphisms on 3
different haplotypes, each in perfect linkage disequilibrium with the mutations on
the "real" FMF gene. While not impossible, such a scenario is at least unnecessarily
complex. It is also unclear where such a closely linked gene would be located. The
historical recombinants at the 5' (centromeric) end of v75-1 exclude the interval
between D16S3373 and v75-1. On the telomeric side, the 5' end of a novel zinc
finger gene is located within 10 kb of the 3' end of v75-1, but thorough screening
has revealed no mutations in this latter gene (data not shown). Moreover, there are
no trapped exons, direct selected cDNAs or expressed sequence tag (EST) hits that
map to the interval between them. Finally, and most importantly, the observation of
normal chromosomes that bear disease-associated microsatellite and SNP
haplotypes but do not have the M680I, M694V or V726A mutations is strong
evidence that these are not just haplotype-specific polymorphisms.

Mutation Detection by Fluorescent Sequencing

The entire coding region, plus splice sites, was sequenced in individuals
representing seven microsatellite haplotypes. Approximately 100 ng of genomic
DNA template was used in PCR reactions to amplify exons and flanking intronic
sequences according to the supplier's recommendations for AmpliTaq Gold (Perkin
Elmer, Branchburg, NJ) and Advantage-GC Genomic PCR Kit (Clontech, Palo
Alto, CA).

The PCR primers were tailed with one of the following sequences:
-21 M13 forward: GTA AAA CGA CGG CCA GT; [SEQ ID NO: 32]
-28 M13 reverse: CAG GAA ACA GCT ATG ACC AT; [SEQ ID NO: 33]
-40 M13 forward: GTT TTC CCA GTC ACG ACG. [SEQ ID NO: 34]
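
As a minimal sketch of what "tailing" means in practice, the Python fragment below prepends an M13 tail to the 5' end of a gene-specific primer. Placing the tail at the 5' end is the usual convention and is assumed here; which tail was paired with which exon primer is not stated in this specification, so the pairing shown is purely illustrative.

```python
# Tail and primer sequences as printed in this document (5' to 3').
M13_TAILS = {
    "-21 M13 forward": "GTAAAACGACGGCCAGT",
    "-28 M13 reverse": "CAGGAAACAGCTATGACCAT",
    "-40 M13 forward": "GTTTTCCCAGTCACGACG",
}
EXON1_FORWARD = "AACCTGCCTTTTCTTGCTCA"


def tail_primer(tail, gene_specific):
    """Return a tailed primer: universal tail at the 5' end, then the
    gene-specific sequence that anneals to the genomic template."""
    return tail + gene_specific


# Hypothetical pairing chosen only for illustration.
tailed = tail_primer(M13_TAILS["-21 M13 forward"], EXON1_FORWARD)
print(tailed, len(tailed))
```

The tail does not anneal to genomic DNA in the first cycles; it is incorporated into the amplicon so that a single labeled universal primer can sequence every exon product.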

After amplification, reactions were run on 1% agarose gels and gel purified
using either QIAquick gel extraction kit (QIAGEN, Santa Clarita, CA) or
Microcon/Micropure/Gel Nebulizer system (Amicon, Beverly, MA). Alternatively,
PCR products were column purified with Microcon-100 (Amicon). Purified
amplicons were sequenced with dye primer chemistry (PE Applied Biosystems, or
Amersham, Cleveland, OH). Sequencing reactions were ethanol precipitated and run
on an ABI 377 automated sequencer. Sequence data were analyzed with either
Autoassembler 1.4 (PE Applied Biosystems, Branchburg, NJ) or Sequencher 3.0
(Gene Codes Inc., Ann Arbor, MI).


Example 3. Protein Modeling 

The deduced amino acid sequence was examined. Two overlapping nuclear
targeting signals were detected using the PSORT algorithm (Nakai and Kanehisa, "A
knowledge base for predicting protein localization sites in eukaryotic cells,"
Genomics, 14:897-911 (1992)). The first nuclear targeting signal is a four residue
pattern composed of a histidine and three lysines. The second is a Robbins/Dingwall
consensus (Robbins et al., "Two interdependent basic domains in nucleoplasmin
nuclear targeting sequence: identification of a class of bipartite nuclear targeting
sequence," Cell, 64:615-623 (1991)). A bZIP transcription factor basic domain (Shuman
et al., "Evidence of changes in protease sensitivity and subunit exchange rate on
DNA binding by C/EBP," Science, 249:771-774 (1990)) was identified using a
PROSITE search (Bairoch et al., "The PROSITE database, its status in 1997,"
Nucleic Acids Res., 25:217-221 (1997)). The spacing of cysteine and histidine
residues between residues 375 and 407 (denoted by plus signs in Figure 2) resembles
a B-box type zinc finger domain (Reddy et al., "A novel zinc finger coiled-coil
domain in a family of nuclear proteins," Trends Biochem. Sci., 17:344-345 (1992)).
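
The four-residue nuclear targeting pattern mentioned above can be illustrated with a simple scan. The sketch below is not the PSORT implementation; it assumes the commonly described form of the rule (four consecutive residues that are all K or R, or three K/R plus one H or P), and it runs on a made-up toy peptide rather than on the pyrin sequence. The function name `find_four_residue_nls` is hypothetical.

```python
def find_four_residue_nls(protein):
    """Return (1-based start, window) for every 4-residue window that is
    either four basic residues (K/R) or three basic residues plus one H or P."""
    hits = []
    for i in range(len(protein) - 3):
        window = protein[i:i + 4]
        basic = sum(res in "KR" for res in window)
        h_or_p = sum(res in "HP" for res in window)
        if basic == 4 or (basic == 3 and h_or_p == 1):
            hits.append((i + 1, window))
    return hits


# Toy peptide (invented for illustration) containing one histidine followed
# by three lysines, the composition described in the text.
toy_peptide = "MSTHKKKLAG"
print(find_four_residue_nls(toy_peptide))
```

A pattern match of this kind is only a prediction of a possible targeting signal; it does not by itself establish nuclear localization.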

Example 4. Localizing Expression of the Protein

The tissues in which v75-1 is expressed are highly consistent with the
clinical phenotype for FMF. Based on the nature of the inflammatory infiltrate and
the anatomic localization of inflammation in FMF, MEFV gene expression might be
predicted to be observed in granulocytes and/or serosal cells. Multiple tissue
Northern blots demonstrated high levels of expression in peripheral blood
leukocytes, primarily in mature granulocytes, but not in lymph nodes, spleen or
thymus, which are composed largely of lymphocytes.

Figure 4 shows the expression profile for the v75-1 gene. Figure 4A shows
the results of hybridization of a probe derived from exon 2 on multiple tissue
Northern blots. A 3.7 kb transcript was found in peripheral blood leukocytes (PBL)
and colorectal adenocarcinoma (SW480). The presence of the transcript in
peripheral blood leukocytes is consistent with the symptoms associated with
FMF. The detection of the 3.7 kb transcript in colorectal adenocarcinoma is
unexplained.

Figure 4B shows hybridization of the same exon 2 probe on Northern blots
with mRNA from purified polymorphonuclear leukocytes (PMNs) and lymphocytes.
PMN lanes represent preparations from different individuals. A β-actin control can
be seen at the base of the gel.

The following abbreviations were used in Figure 4: HL-60 (promyelocytic
leukemia); K-562 (erythroleukemia); MOLT4 (lymphoblastic leukemia); A549 (lung
carcinoma); and G361 (melanoma).



Northern Blot Analysis 

To determine transcript size and level of expression in various tissues,
multiple tissue Northern blots (Clontech) were hybridized with probes derived from
various exons of the gene. These exons were amplified and purified as part of the
sequencing protocol for mutation analysis. Larger exons (2, 5, and 10) were labeled
by random-priming using Stratagene Prime-It Kit and 32P-dCTP (ICN).
Hybridization and washing of blots were essentially as described in Sambrook et al.,
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, New York: Cold
Spring Harbor Press (1989), except using Hybridisol I (Oncor) prepared
hybridization buffer. Hybridization was detected by autoradiography, with 4 hour
exposures. Northern blots with mRNA from highly purified peripheral blood
lymphocytes, PMNs, and monocytes were the kind gift of Drs. H. Lee Tiffany and
Harry Malech.

Example 5. Homologies to Other Proteins

Figure 6 shows the alignment of the rfp (B30.2) domain of pyrin with
homologous proteins. The following abbreviations are used in Figure 6: hum-RFP
(RET finger protein; SWISS-PROT P14373); xla-xnf7 (nuclear phosphoprotein
xnf7, Xenopus laevis; PIR A43906); pwa-A33 (zinc-binding protein A33,
Pleurodeles waltl; SWISS-PROT Q02084); hum-SS-A/Ro (52 kDa RO protein;
SWISS-PROT P19474); hum-afp (acid finger protein; GenBank U09825); hum-BT
(butyrophilin; GenBank U90552); hum-efp (estrogen-responsive finger protein; PIR
A49656); hum-B30-2 (B30-2 gene; PRF 2002339); pig-RFB30 (ring finger protein
RFB30, Sus scrofa; EMBL Z97403); hum-Staf-50 (transcription regulator Staf-50;
PIR A57041).

The invention has been described with reference to various specific and
preferred embodiments and techniques. However, it should be understood that many
variations and modifications may be made while remaining within the spirit and
scope of the invention. All publications in this specification are indicative of the
level of ordinary skill in the art to which this invention pertains. All publications
and patent applications are herein incorporated by reference to the same extent as if
each individual publication or patent application was specifically and individually
indicated to be incorporated by reference.

What is Claimed is:



1. A nucleic acid sequence encoding pyrin.

2. The nucleic acid sequence of claim 1, comprising the coding sequence of
SEQ ID NO: 2 and variations thereof permitted by genetic code degeneracy. 

3. A nucleic acid sequence encoding a familial Mediterranean fever-associated 
mutant of pyrin. 

4. The nucleic acid sequence of claim 3, comprising a mutant pyrin having an 
amino acid substitution in a rfp (B30.2) domain [SEQ ID NO: 5]. 

5. The nucleic acid sequence of claim 3, encoding mutant pyrin comprising the 
amino acid sequence of SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11 or
SEQ ID NO: 13. 

6. A nucleic acid probe or primer comprising at least fifteen consecutive nucleic 
acids of MEFV [SEQ ID NO: 1] or a familial Mediterranean fever- 
associated mutant thereof. 

7. The nucleic acid probe of claim 6, wherein the probe hybridizes to MEFV 
under stringent conditions. 

8. The nucleic acid probe of claim 6, wherein the probe hybridizes mutant 
MEFV under stringent conditions, the mutant MEFV comprising a nucleic 
acid sequence of SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10 or SEQ 
ID NO: 12. 

9. The nucleic acid primer of claim 6, wherein the primer amplifies MEFV.

10. The nucleic acid primer of claim 6, wherein the primer amplifies a nucleic
acid sequence encoding a rfp (B30.2) domain of pyrin. 

11. An amino acid sequence comprising SEQ ID NO: 3.

12. An amino acid sequence encoding a familial Mediterranean fever-associated 
mutant of pyrin. 

13. The amino acid sequence of claim 12, wherein the mutant comprises one or 
more amino acid substitutions. 

14. The amino acid sequence of claim 12, wherein the mutant comprises an 
amino acid substitution in a rfp (B30.2) domain. 

15. The amino acid sequence of claim 12, wherein the mutant comprises an 
amino acid substitution in at least one of amino acid residues 680, 694, 695 
or 726. 

16. The amino acid sequence of claim 12, wherein the mutant comprises an 
amino acid substitution corresponding to M680I, M694V, K695R, or 
V726A. 

17. The amino acid sequence of claim 12, wherein the mutant comprises SEQ 
ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11 or SEQ ID NO: 13. 

18. An amino acid sequence encoding pyrin comprising an rfp (B30.2) domain 
of pyrin [SEQ ID NO: 5] . 

19. The amino acid sequence of claim 18, comprising an amino acid substitution
at residue 680, 694, 695, or 726.

20. The amino acid sequence of claim 19, wherein the substitution comprises
M680I, M694V, K695R, or V726A.

21. An anti-pyrin antibody that specifically binds wild type pyrin [SEQ ID NO:
3].

22. The antibody of claim 21, wherein the antibody specifically binds to an
epitope in a rfp (B30.2) domain. 

23. An anti-pyrin antibody which specifically binds familial Mediterranean 
fever-associated mutant pyrin. 

24. The anti-pyrin antibody of claim 23, wherein the antibody specifically binds 
to a mutation in a rfp (B30.2) domain. 

25. The anti-pyrin antibody of claim 23, wherein the antibody specifically binds 
to pyrin comprising a mutation at residue 680, 694, 695, or 726. 

26. The anti-pyrin antibody of claim 23, wherein the antibody specifically binds 
to mutant pyrin comprising M680I, M694V, K695R, or V726A. 

27. The anti-pyrin antibody of claim 23, wherein the antibody specifically binds 
to mutant pyrin comprising the amino acid sequence of SEQ ID NO: 7, 
SEQ ID NO: 9, SEQ ID NO: 11 or SEQ ID NO: 13. 

28. A vector comprising a nucleic acid sequence encoding pyrin [SEQ ID NO: 
2] or a familial Mediterranean fever-associated mutant thereof, operably 
linked to a functional promoter. 

29. A host cell transformed with the vector of claim 28.

30. A kit for diagnostic assay comprising:

a nucleic acid sequence encoding wild-type pyrin; and

at least one nucleic acid sequence encoding a mutant pyrin. 

31. A kit for diagnostic assay comprising: 

an anti-pyrin antibody which binds wild-type pyrin; and 
at least one anti-pyrin antibody which binds mutant pyrin. 

32. A kit for diagnostic assay comprising: 

at least one pair of primers which amplify a nucleic acid sequence encoding 
pyrin. 

33. The kit of claim 32, wherein the primers amplify a nucleic acid sequence 
encoding a rfp (B30.2) domain. 

34. A method for diagnosing risk of familial Mediterranean fever (FMF), 
comprising: 

analyzing a patient sample for an amino acid sequence of pyrin; and 
correlating detection of mutated amino acid sequence with risk of developing 
FMF. 

35. The method of claim 34, wherein analyzing comprises contacting the sample 
with an anti-pyrin antibody and correlating antibody binding with the 
presence of pyrin in the sample. 

36. A method for diagnosing risk of familial Mediterranean fever (FMF), 
comprising: 

analyzing a patient sample for a nucleic acid sequence encoding pyrin; and 
correlating detection of mutated nucleic acid sequence with risk of 
developing FMF. 

37. The method of claim 36, wherein analyzing comprises contacting the patient
sample with a labeled nucleic acid sequence encoding wild type or mutant
pyrin and correlating hybridization with the presence of wild type or mutant
pyrin.

38. The method of claim 36, wherein analyzing comprises sequencing the 
nucleic acid sequence of pyrin. 

39. The method of claim 36, wherein analyzing comprises sequencing or 
hybridization of a nucleic acid sequence encoding a rfp (B30.2) domain. 

40. A method for producing pyrin in a host cell comprising transforming the host 
cell with a nucleic acid sequence encoding pyrin. 

41. The method of claim 40 wherein the host cell is an animal cell.

42. The method of claim 40 wherein the host cell is a mammalian cell. 

43. The method of claim 40 wherein the host cell is a human cell. 

44. The method of claim 40 wherein the host cell expresses mutant pyrin prior to 
transformation. 

45. A transgenic animal expressing heterologous wild type pyrin or mutant 
pyrin. 

46. A method for screening compounds for use in FMF therapy comprising: 
administering candidate compounds to the transgenic animal of claim 45. 

47. A method for screening compounds for use in inflammatory disease,
comprising administering the compounds to the transgenic animal of claim
45.

FIG. 1

i TATITTIGIA TTTEAGTAGA GATOGGGTTT ACIGTGTIQG GCAGGCIGGT CnGTACTOC 

6i CAACCT3AQG 'lGATOCAGOC AXIOGGXXT COCAAAGIGC TOQGATEACA GGOGTEAGCA 

121 CI GI G OX'IG COOOCAACAT GIBA LTIL ' IU TEAGCTICAA AGOCACCICr G GQGOGL'IGC 

i8i PCCPCKEKTG AGCIGAAGGA. CAO30GTG0C TITIUACOOG IGTAGL'IUJA. GCATCTIGGC 

241 ACACIGTCIA GAATOTICAA. TGAA3GTQCA OGGAAGAGCA. TICIGGCICC aGGGAGOGAG 

301 GACIGAGTCA QZICIQ3GAA CAGA1GAGTC AGGCIGGIGG TOCAQQCAJT GLTJ.TJAJAAG 

36i ' l ULT lCA UUl' GXTDGGAAGA. AOCAGTCAAC TGGAACGGGA TCaACAQGGG TGATOGCAIG 

421 GOtfGAGTIA TCICCIQQCA OIUUULTllT GGGCICACTr GOCTICTIGG GOCAQGAAAG 

48i GCAAAGCICA CSGGACIGTA. TTCAGTGOOC PCCCCTTCCC COGIOCIGIG CCATIGGCIC 

541 TGGAAQGICC CTCAAAOOOC GAGICIGGAG GAGAACAGTT GACCAGCAGG G03QGOX1C 

60i AGCATAGTOC TCICTGTICC OVTICAOXG C1C1G0CAGC OGCAGAOXXrr GQCAGGAAGG 

66i AAGATIGGAG G333IGICIG GAATOCAATC CCAGAGCTIC CCTTGCAGAC TIGGQCATCT 

721 GICIGIGSIC TAGIGIGGAG GOGAGGTOCA G3GITIQ3GA Q3GGIGIQGG QQCACATOIC 

781 TOOOtfGQCA TGGAGOCCIC CCAGCIGGAA AATOCICIGA AOCTOEAAGA AGAGAACACA 
84i GGOGGC A TOG ACACAOXTr AO0CTEAGIC TC2GTIOTA OCAAGACACA GAGCAU.T1UJ 

9oi ' luiuuL ' rrrr oaocnaTrc acaaocigcc titicitgct caccaaggac agaqgcitct 

961 TTIOCEACCA GAAG QCAGAC AGCIGGC1CG ALUL'IC'IOCT GCICAGCAOC fATfoCTAAGA 

1021 QOOCnaGIGA. U L K ll' l GClG TCX3QX'1GG AGGAGCTGGT UULUimGAC TIOGAGAAGT 

losi TCAAGTICAA GCIGCAGAAC AOJAG1G1GC AGAAQGAGCA C7ICCAQGATC QQOCGGAQCC 

ii4i AGATDCAGAG AGOCAGGCOG GIGAAGA1GG OCAl'lL'lULT G3K2CCTPC TATGGGGAAG 

1201 AGiaoxasr qctqcicacc cigc a ugio: amaoaacAT caAOCAGoac cigciggcog 

1261 AGGAGCTOCA c?¥fflT'#rr AT IC A GGG TA AGCGGGOCC A GUUL'IGL'IOC TCATCCAGTG 
1321 CIGAGIGCIG GCIGCTTIGT GQGAAAGQQG AGCAGGAGCT CAGAGCAGCT CaCICIGAOC 
1381 TOGSGATIGG GAGTCICAQG TCTAGCAAAA TCCAGAT3AC TITAGTICAG GAACGTQCCT 
i44i TICTICACTC 'IGUULTIMGG AACIQQGITA GEAAACTICC TTCAG3CT0C TAATGGGTTT 
isol TTEAAGAAGC AGGTCAGGGT CAOGAAAQQC AGGAGCIGGA ACAGCIGTIC TTIGAGACrr 
i56i CTICACEACA TTTKEGATIA ATACTCAIGT CAGACAAACA TCICTAGGIT AGCAAAAAGG 
i62i GATIGCEATC CAATCATATC AACGGGGTIG GTATAGAATC TICICAGTOC 1G1TUAGCAT 
i68i GnGQOCAQG C l GGl C r O GA , ACTOCIGACC TCAAGIGATC CTOOOQOCJDC AGCCIOCCAA 
1741 AGIQCIG3GA TTICAGACAT AGGOCAGGGT ULUXAJLTIA TlTlTATnT TAAAGOGTAT 
leoi AATCIQQGIT TIGCIGAOCT GIGTAAGATC TTATTIGAAA CAGT1G1ULT GCTEAAAAGG 
i86i TTIGAAAAGT ACEATTIGAG AAATATAGGC TAGGCATOGT GGCICACACT TAIAAATAAT 
1921 CICAQCACIT T3GGAGGCTA AGGTOGGIGG ATIGCTAGAG CICAQGAGIT TGAGADCAGC 
1981 TIGQQCAACA 'IGGI GA AAQC CTGICICIAC CAAAAATACA AAAAAATCAG CCAQQOGIGS 
2041 TAGCACACAC ClUiati ' lTlU AGCTATTGAA AAAACAGAAA ACAGQCIGAG GT3AGAGGAT 
2101 TGCTIGAGOC TQGGAGGCAG AGGTIGCAGT GAGCIGAGAT CACATCAQQG CAACAGAGCA 
2161 AGATOCIGTC TCAAAAAATA AAATAAGAGA GAGAGAAATA CATAGCAACA TCAAGCATOT 
2221 TCTTACTGAA TOGTAATIGA CIGOCATIGT CEAGTCIGGG NAGICCIGAA CriTlUlTlT 
228i TGAGATGGAG ' ILTIGC ' ICI G TCPCTCN332 TOGAGTGCAG TGGOGGGATC TCAGCK3SCT 
2341 G^AACCTOCA CATGCOQGGC TCAAGCGATT CICATOOCTC AGOCiaGOGA GTAGCIGGGA. 
FIG. 1 (CONT.)

2401 CTOGGTGC QCACCAOCDC GTCIGGCTGA GTTTCTTATr TITaGISGGA AGQGQSTTIT 
2461 GGCASGTTGG CCMGCIQGT CTOGAfiCTOC T3ACCTCAAA TGAICC'IUJC ACCTIGQCCT
2521 CTQGAGAAGC TOOGfiTraCA GGCA3QOGCA OC3¥33CICaG LT12AJ1TJ.T1U TKTITTIMr 
2581 AGAGACQGGG Tl'lCAGOCIG TIGGTCTTGA ACICCTGATC TCAGGTGATC ClUGGLjUL'IC
2641 QQCOCnaG AUIUOJUUA AIACAGGCAT GAGGCACOQC GaXJQGOOCG TlUlTl ' lULT

2701 caKinciaA actttaatat gcaaggggat iciuic ' ic ct cigqocigaa tctigggccc 

2761 TAAAG3TGGG ACAGLT1CAT CATITIUCAT ClUJl ' miUU TICCAGAATA TIQ2ACACAA. 

2821 GftAAAOQGCA CAGATCATIC Q3CAG3GHHL' AOJIUUL'IQG GQGAGAACAA QOOCAQGAQC 

2881 CIGAAGACIC CAGAGCAOOC 03AGGGGAAC GAQGGGAAQG 00000330.' GWJCfJGmj 

2941 GGAOJIOXA OX:il3GGmG CAGOCAQOOC GAQGOZGGGA GQGQGCTG'IC GAQGAAGOCC 

3001 CIGAGCAAftC Q2AGAGAGAA GGCCTGGGAG (JUULVLOGflCG CGCAQ33CAA GOJlULJLaAOC: 

3061 OQGAGCQDGG UUL ' lUUUULK OGGGAGAAGC CCEU3XCCT GCAG3GGGCT AGAQGGGGGC 

3121 CAQQGGGAGG '1LU3X1UGS GAGAAAGGCC AGC1CO30GG QGAGGCIGCA a:m'?lGUCG 

3181 aayjcmxr: cgggqcagaa qgagigcaqg c u lt j xii jA ag t3taociqcc ctggggaaag

3241 ATGOGACCTA GAftJULTllA QglCACCATT TCTACAGGGG AGAAQG030C QGCAAATOCA 

3301 GAAATICIOC IGMl'ICTAGA QGAAAAGACA GCIGOGAATC TOGACTCQGC AACAGAAOOC 

3361 CGGGCAAGGC a^'lCUJGA IGGAQ333CA 'IC'lOGGaflOC TGAAGGAAGG OCCIGGAAAT 

3421 GCAGAACATT (JLLMCAOOGG TAAATTGTGT 'lL'i'l'lUJAAC TTEATATO3G CTGCAGAGAA 

3481 AGAATOGCTG GOQQQQCAGG ATAGCICATC GCIGTAATOC CAQOQCITIG GGAQGGCAGG 

3541 GGGGGAQGAT T3CTGGAGGC CAAGACTTIG AGAGCAGOCT QGTGAAIGTA. GTGAGAGODC

3601 OJUU & 1C11T ATAAAGGAAA TEAAAAAAAT AAAAAGGCAA AGGTTGGGCA GQGQGTGGTA. 

3661 Gcic'iaaocT gtaatoocag aqctttgaga qggcigcagg GGAQGATCIC TIGAGOCCAG

3721 GAGTTOCATA CTAGOCTAGG CAACACAGIG AGAGCGCATC TCIACAAAAT ACAATAGIGG 
3781 CAOQGQGdG TAU1UJJAGC TGCTOGGGTT CACITGAGCA GAOGGAGTIC CAGGCTACAG 
3841 TCAGCTGAGG ATCA3QGCAC TGCACACCAG GCIGAQCAAC GTAGOCAGAC TCACTICTAC
3901 AAAACTAAAA AAAAAATTAG CTGGGTAIGG T3GCACAGGC CIGTAATICT AQGCACICAG 
3961 GAAGCIGAGG CAGGAQSATT GCTTGAQOCA GGGAGTTGCA GGCIGCAGIG AGCIGAGGAT
4021 GTGOCACIGC AL'IUJGGOIT QGGCAACAGA GCAAGAGGCT GICICTTAAA CAl'l'l'lUGGG 
4081 GGAAAAAAAA AGAAAGAAAG AATOIGOGAT TGAAAAAGGC AATCAQGTGT TATCAGTQGC 
4141 CAAAGAA1GG AGAAGGGGAG CICAQJIL'IG CAGULGIL'IG CTIGGCAGGG AIGGGAGGCA
4201 GGGGGATTTT AGAGTCCAQG GAGGGGAAGG GAGATAGGTA AGCAGGOOCA GGGCAGGGTT 
4261 CCKEKIGIGC AQGGGCIGTC GGCAGCAIGC TTCTIGCTAC ATQQCATICA AACAAAGGCT
4321 TCTCCMCIT C1TI1AGGGGA GGACOCTTTA GCTEATAACC ATCIGTAAAT GATOC7EAAGG 
4381 TAACIGGAAG TCACCICTTC CAGITIQCAC lUUl ' l ' l ' lULT CIGftTCITAA. CTlUL'lUlUi 
4441 TITTIGGCAA GGGATCAQGA GGCIGCAQGC CATCIGGATT TTITTAAGCA GCIGTGGCEA 
4501 TAQ3TAAAGA GACEAAAAAA AAACIGTAAA AGAAAAAT3C CACCAGTTEA GAGGGTADOG 
4561 AGGCEATOCA GGTGACAAIT CCA3GCT03T QGIGGGGGCA GCATTCAGAA ACACACITrC
4621 CrmTlTlC L'lUL'lTITlT TITITGAGAC AGAGTCICAG 'ICIGTCTOGC A3GCTGGAGT
4681 GCAGTAGIGT GAGCACAGTT TACTGCAGGC TCAACC'IU-T AQGCTCAAQC GAT0CT00CA 
4741 CCTCAGGCTT CCAAGEAGCT GAGACTATAG GTGC1CAOGA GCACACCTGG TEAAXlTl'rr 
FIG. 1 (CONT.)

4801 TmTJ.TJ.Tr TCTATTETIT GH2AGTEA03A GGACIGICEA TGaTOOCCAG QCIQ3T3TIG 
4861 AACICTTGGG CTCAAGGGAT OOOOGUULTr AGOCTCTAAA AGIGCTAGGA TlTvJAUUIUl ' 
4921 GAGTCACTAC AQGCAULLTA TGGAACACAC TTTOCAATOC A TlUl ' lUUL T GGAGAGGAGA 
4981 AATCACAGCA CICAAGGAQG AGAAATAGAA TKJ3333IDC A33G0GGG1G GGGIGGCICA 
5041 TAOCTSCAAT OOCAOCACIT 7GQGAGQ0CA ATOGGGQCGG A3CACCIGAG GTGAGGAGTT 
5101 GGAGAOCAQC CTGOCAACAT GGTCAAAOGC CAICTCEACT AAAAATACEA AATTIQCIQG 
5161 QOGIQC3IGQC GGGTGIOCAT AATOOCAGCT ACICAGAAQS CTIGGAGGCA GGAGAATIGC
5221 TIGAAGOGAG GAGQCAGAGG TIQCAGTGAG GCAAGATCAT QCCACIQCAC TdAGOCIGG 
5281 GOGACAAGAG CAAAACTCTG 1CICAAAAAA AAAAAAAAAA AAGAATIGGG AGICCAGGGA 
5341 GCHLTCAGAC CTOQGAGQGG AAAQGATCTG GTA3GCIGCA T3AGTCTICA AATOCAGAAG 
5401 TOCCTGGGIC TTCCAGIGAG AAAGGAOGCT GQGATCIGGA AAAOCTAGCA TOCITAGGAA 
5461 TAGIGAUL'IG AAAAGTACIG AAGTATITOC OOOCIAAnT TCTITEATCC CTACIGTATT 
5521 TITTITAATr TriTlTlTlT TTEAGATATC G33ICTIGCT AIUl ' lG OCCA GGTK33ICIC 
5581 GAACTOCIGA. TCICAAACAA TOCT30CATC Tl'lGOL'lUUG AAACIGC1G G GATEACAGGT
5641 GIGCAGCACT QCACCAQGIC CDCACIGTAT TTATATCATT QQGATIGCIG GG3X3ICTTCT 
5701 AGGQOOGCTT G3TEAA3CIG -ATOCAGQCTr AGAGGCIGAA AAA3X3CATAT AIGCACAGCT
5761 TCACAAAIGT CACATCAAAT TICAQGTAGT '1LT1ULJACAC TC3GAAGAOC ATCTTTAGAA 
5821 TOCAAGGGGT TEATOGACAC CAGGEAGAAA ATCIGGGGAA GACTGGTEAA AAATOCTaOC 
5881 TCTCACAATA. ACCICACAGC AATOCATCAT CATOQQGTIG AGATTCEAGC Al ' lUULTlTU 
5941 '1CICAGCAGA AAGAAAAGOC TATIG3CIAA AGIDCEAACT ATCTOCTGCT GAGGTAGICA 
6001 TEAAAATEAT GTITSGTIGT GAATAATAGA AACAGOCAAA TAACAGTAAC CTCAACAGAA
6061 AAGAAGTTIG TGOCTOCTIC ACATAAATCA TACACAQGOG GTOCCAGQCA GAJC03IGGG 
6121 GCAGGAGGCT GQQGTOCTQC TGTIGCICIG TCOCAGCAAG T1TJ1ULTCA . A G L T IL' IULT
6181 CICAGAAGGT GAGGTOCTCA TQGCAQGCAG CAAGATQGAG GAACAGAGQG GAACAGEATC 
6241 OCIGGGGAAA GCTGEAGAAG T1TLTAGAAG CIGCTTGTGA CAOCIOCATE TACA 1ULXJJ.T 
6301 TGGTCATATT ATIGTCAAAT AGOCACAGCT AACIGCAAAG GAGGCIGAGA. AATOCAGQGC 
6361 ATTTGOGGGG CAAT3GGAGG CAG3GAAACA GGGAAA03IG GACAATEAAT TCTATCAOGA. 
6421 GAGAAGGAGG GAGAGTAATT TCIGGTGACT ACTAGCAGIC TCATTTACAG AIGTGC'IGTG
6481 AATTICIGGG ACACIGTGAG GTGGGAQGAG GTAGCAGGGG CTAAAGGATT GA G1U1U1TT 
6541 C1MTJ.LTJ.T T1T1UJ.T1TT TITITTITIG AGATSGAGTC TCICTIQGIC AGOCAGACTG
6601 GAGIGCAGTG GCGCAACTTC AGCICACIQC AAACraOSOC T0CD3G3TTC AAGCAATTCT
6661 C3CTGOCTCAG GCIOOOGAGT AQCTQQGATT ACAG3TQ0CC AOCAGCAGGT 003GCTAATT 
6721 TITUJmTIT TAGTAGAGAC AG33ITTCAC CATCTIGQCC A3GCT33TCT TGAACrOCTG 
6781 AOCICA3GAC OCAOGOGOCT OGGOC TO X A AAGIGCIGGG ATEACAQGOG T3AG0CACTG 
6841 CUL'lUGGOLT TJ1UJ.T1LTA TlTLTlL ' lTo TA TL ' lUUIGG (ZA IUIL ' IOLT TATGAAGTIG 
6901 CAATEAGAGT CTTGGAGTAG AQCEATTCAT AACTGTEAGG TCrTCATCAT GAGITGCAGT 
6961 CTTEAGOGCT ATAATOOQOC (JLT1LT1TUC TITTIITTIT AAGATOQCAT CITACICIGT 
7021 TGCCCAQGCT QGAGTGCAGT QGIGCAGCAT CAAGCTOCTA GGTK2AAGCA ATUL'lUl'ICT 
7081 CICAGGCIDC CAAGTAGCTG QGATEAGAQG TGTQCAOCAC CACACdGGC TAAIiTlTlTA

7141 ATrrrriGEA gaggigggct ctigocatct tooocaggct ggtcicaaac tocigagctt
7201 AAGCAGTGCT OCCAGCTIGG OCICOCAAAG CACIGGGATT AIAGGCA3GA GOCBCCAOGC
7261 AUUUULT1LT TTQCmCAT TTAAIQCJITA. TTCAACTCAT AIGIGAGCAG 1 U G1LTAT1T 
7321 ATTDCTICAT TGAA3BCDCA. TTTICCAAAT ULT1UJATIT GOCAGGEACT C1UL'1!AGGGG 
7381 CIGGGATOCA GCIAGGAGOG AGGEACACAA GICAGCATCC OCTO3AA30C TOCACICAOG 
7441 TTATO33CSG (rAGGGAOGG GTICAAGTGG CBAAGGAACA CIGGTCAGAA lUlL ' lLTl ' JLL ' 
7501 CTIGGCATCA. LL'lUL'llAGAT CTAIGTCIGT QCMGfiQGAA CAGCACAAGG CX2AIG3GIUT 
7561 T1UJ.T1SAGGA. TAAATGOOCA AGAATICCAA GGCICAGGAA. TGICK3RQGT CIULXJULTIA 

7621 oacicfiQac a^TGoacr Grnocna: tcaciggatc GAAGroooaG gaggacaagc 

7681 TAGGAAGIQG GCAGAGIdA ACTCAGAACT OGCACA3CIC AGGCAAQGGC TGTGTOGGCT 

7741 GTOdTIGIG ATAUL'lL'lUr GTAAQCAACT TOGSTTIOCC ATICAGG335 'ITl'l'lUCACT 

7801 GCATOICOaC AG GAAGGOCA. aaGACAOQG CTQCGPGTOC aaGC'lUJtJAC GCCCAGGAAG

7861 ' GAGAQDGAGT TGAOGGIACC 1CTCTGOG1U Ai'lUL'lULMG LTIUJUULJAG (jUflUmUlU 

7921 GGO mXJU A. (JJLLM.UM3GC AGQQGC'ICAC ClOjCIGCXX' OCX331GGGAG Ga L ' lU U UM G 

7981 AAAQGAAGAG (^IGGG A AGC CEAAQQQOOC AGOJQCiaCC ACAGIGIAAG aaGCAOl'lGA. 

8041 AGCAGGICCA UL ' lUL ' llTlC TGIGAQGATC AOGMGftQJC CTTCimi'lC A1U1GCAL?1U 

8101 T5AGICAQGA . GCAGCAAQGC CaOGGGSIGC OIIiail'lGA QGAQGiaGOC CTQGAACACA.

8161 A9GTEAG3CAC 'lUUL'lUGCIG TGOGCICTIC TCTOOCAGQC ACTIGGACAC ACIGGGCCIT

8221 ALT1CALLT1T OOCAACAACr CIG33ITGIT G31GCATEAA. GCAGCAT1LT TGGGCIGGAA 

8281 ATOQCAAGAA CACAAXATAA. AGCAGIGCAG CAAAGAGQGG AGCIACAGGT TEKDSITGCT 

8341 QGAGATCCA. GGQGGAGCTG GCFICAGGTA. TOQCIGAATC CAGfiGQCICA. GAGGAAGIGC 

8401 CIL'IL ^ X'IL' TGCTGOCTIT GGCAATICAG OCATKXTCC CTOCICTTIC CIGAGCACCC 

8461 ctgooc a toc cgciggcaqc agcaoocica gqctiqctac cagaaggaga. Tsnmrrc

8521 CAGAGTIGQC AOCAGCIAAA. GATOQCAQGA GOCAAATICA AULT1T1CAA. GAAGIGdGT 

8581 TTITQCAGAA. GAAAATICAG AAQCAGCIGG AQCATCIGAA GAAQCTGAGA. AAATCAQQQG 

8641 AQGAGCAQCG ATULTA1UGG GAGGRGAAGG CAGTGAGCTT 3dGGTAfiGG TGAGAGGIGG 

8701 CIGftDBaaX' AT0GGIO0CT GGGAGGAAGG TGGGAAGAGT GAGCAGGGGT OOOGGAGATT 

8761 CiaaO DQ BT TCACAGGGCA. GCAGGGATOG OCAGCTOCTC TCAGGGGACA. GAQGGTAACC 

8821 AGCAQOCAAG GGTAAGCICA. TC0CIGTM3A. GGGAGACCAC OCOCAGCAGG CAQGGGTCAC

8881 CICIGAGGAT (JL'IUICAHUC TTTCICATAC TCAOCAGAAG ATQGTAGAGA. GCAAOCIA3G 

8941 OCGGIGACTA CTGCAGAAAG ATOQGATIGA GGAAAAGGGA QGAGAAGQGC ALTl'lLTl'lT 

9001 TITSIGAGGG AGICICGCTC TGTCAOOCAG (Jl ' lUia G I GC AGTOGTGTGA. TCTTOGCICA. 

9061 CIGCAAOCTC TGOCTaOOGG GTICAAGGGA. TICTOCBGCC TCAGOC'IOCT GAGTAGCTQG 

9121 GATEATAGGT GAGTGOCACC ATOOCTOGCT AAlTlTlUiA GTITEAGTAG AGATOGGGIT 

9181 TCAOCATCIT GGICAGGCIG "nCTOGAACT OCIGAACTOG TGATGOGGOC GCCTIGGOCT

9241 OOCAAAGTAC TQGGATEACA. GATGTGAGGC AC1GGQ0OQG GOCAAGAACA. CITITAACTr 

9301 CATAATTEAC ' lClUlUlTlT TTIGTETIGT TICCAAGATC GAGTCTOGCT CIGTCAGOCA. 

9361 GQCIQGAGTA. CAGTGGCACG AiLTlUULTl ' GCICCAAGCT CXaOCTOGGA. GGITCAAGCA. 

9421 Ai'lL'llX'IUC CICAGOCIC3C TEAGIGGCIG GAATEACAGG GGOCIGGCAC OGOGGCIGGC 

9481 TAATnTTGT ATTnTAGTA GAGAOGGGAT TPCAGOGIGT '1GGCCAGGCT GGTCICAAAC 

9541 'lOl TCA CdC MGIGATOCA GCTGCCTOGG GCIOOCAAAG TGCTOGGATT ACAGGIGIGA 
9601 xmub'iuc: GBsaacBsar TmTiurrr TnaGQGrrr Trrm ' mT tttittitga

9661 GKD3GAATCT CACrOOGTOG TOCAGGCIOS GGTGCAGIGG T3CAAICIOG QCICACIGCA 

9721 Mocrrooa: tooocagtig amcamtct otgocicag cctcgcjgagt TScraaGAcr 

9781 GEAQGCACAT Q0CAOCACIC CIQQCEAATT TriUUKi'iTl' TAGTAAAGAC MAUITIUUL' 
9841 CAIGTiarE A3QCIQC3ICr aSAACTOCIG ATCICAAGTG ATCIGOOCAA CTCAGOCTCC 
9901 CAAAGTCCIG GGATTACAGA. CATCAGGCAA TOCAOCEAQC QCAAATTIOC GCATlTmrA. 
9961 AGACAACATT TATATIGGAT TAGGGAQOCA. ODCAAltDCA GTAGGACCAC ATCITAACTA. 
10021 ATEACATCIG CAAGAACICT TA3CIDCAAA TAAGATCACA. TOCTSAGTAC TQGGGGTEAG 
10081 QGCTICAACG TGrAAATTIT QGAAQ3GACA. CAGTEAAADC TTAACAOCAG GTITAAQGAC
10141 ATJ.T1CCCAG AGCEAGOCXX: AGCCATGCTC AG1LT1T1LT GGAAGGTTOC AGACAAIATC 
10201 GOdCCIQCT CIGGAATCTA. QOOCTIGAAG AQQCAGGATA AGOOCACCIC TTAICCAOCT 
10261 CCAQ3AGGTG QQCTICIQQG GGTIOCIQGA. CATOCAOSIC CAOCCACAGC ACAGAOOOOC 
10321 ATACCIGOCT U IUJIUIUCT QCPCA GAAAC AAACIGAAQC QCTGAAQCAG GGGGTGCAGA. 
10381 GGAAQCIQGA, QGAGLJlUliAC TAL'l'lUL'lLG A3CftGCAAGA. QCKiTILTIT GIQGGCTCAC 
10441 TGGAGnAQGT QQQQCAGAIG G1UXJUULBGA, TCAGGAAGGC AIAJGACAOC (JUmimiOC 
10501 AQGACAJD3C UL'IUL'IUGAT GQ3CTGATIG GGGAACIQGA Q30CAAGGAG TOOCAGICAG 
10561 AAIQQGAACT '1L'1UCAGL 3 1 G GGTGTQOCIG CjULXJUULjC'lT TCnGQSTOC UJlUiULXJlA
10621 TCAGGATOOC Ul^JUL'lUG C AGCICIGOCA. TCAGOOSIGC TOGAACAAGT QOGTEGAAGOC 
10681 CEAAGGQCTA GGATAGGACT TGGICTIGGT GAGCCACAGT GOCTCTIGIG QOCAGAGOOC 
10741 TTIGAIGAGG 1CICKZAGGA. GCCCAQQGTG GOCTIGGTATC CAQQQGATCT CIQOCATITC 
10801 OZAGAAGGGA. TCAGCAGGGC TIGAGGGGOG TICXZATIGCA GOOCTOOXA OCIQGGA3QC 
10861 CIGAATIOOC GIGGTEAGAA TEAGACTIGA. AGAAAG3TQC 1CCALT1ULA CIGACAGOCT
10921 AGGQCAQQGA QQCCIGGTAA GIGCAG03G3 GAGCEAAAAG T0CAGGAG0C CAGAAGTAGA. 
10981 G3XAQGAGT CAQCOCAGCC ACEAGGAQOC TGGTAAO0GA. CAGl'l'lULTl' LTITITIUIU 
11041 CTAQGACATT QGAGACAJCT TGGACAQGTA CAQ0GAGC3IC CIGK3SIGTA. OOCTOGQGTC
11101 TCTIGCAGAA AQCAIA3QQG QGAGACAGTC OCAGAAQQGA CCIQ3GAQGG AGATCTTOOC
11161 AAGOOQQQQG ' I C' l GI GftlTC CAGACIOCIC LTlTlTlt'lU CZAGCTTOOCA AAGOdCICT
11221 GGATTTGAEA. GGGAGAAGQG CATCIG3TCA. GCAGQGAGGC 1GG00333TA. TOGAGCIGCA. 
11281 GACIGQGAAG GGIGAATICA. GGCCATOCIG CIGAAACAAG AIQGAG3CIC CCTAAGAAAC
11341 CTIOCGAGIG CATIGTGTOC OJ1UUAGTIC A3XJIGA3GAA AJL'IQGOXT TCAGGOCIAC
11401 ' lUmUUUL T l ' GQGAAGCTIG TTIGGAGIGG AGL'IUJUL'IA AG0QCAGCAG GAAQQGGAGG
11461 GGAG3GAAGS GACAGGAAGA. GGCTAAGOCT TAAAAICAOG 1GGGAGLT1T ACAAAAIOOC 
11521 GJIUILITIT ' HJIUIUIUUC; TICTICACrr AGCATAA3GT CITO3QSCIT CATCOGTGTr
11581 GTAAOGTGm TCAGAATTEA. TITIVJI ' ITIT AIG3CIGAAT CATAGIOCAG 'lUlUlUl'lLA
11641 TACATTTTGC TimiXAT K' A3GGATATOG GGALT1LT1U TAALYITIUG TITHEGAATA.
11701 ATOTIGCEAT GAACAAG3GT GTACAAATAT CIGCITGAGA. COL'IUL'ITIG TUAil'lTlGQG
11761 TAOCEAOOCA. GAAGTGGAAC TGGGGGAOCA. TGTGGTIATC CIGIGITEAA TmTlTlUA.
11821 GGAACCAGCA. TOCTAATTCT CACAQQQQCT GCATOGCTIC ACATTCDCAC CSGCAGCACA
11881 CAGGGQCTOC A Gl'l'lUlU CA. CAICTTTOOC AK2ACTTATT TILTIL'IUIT TCAL'ICTCTC
11941 ' lUlClUlLTl ' TrnTITGAA GACAQ0G1CT TQL'IL'IUICA. TOCAGGCTG3 AGTOCAGIGG
12001 03CEA3CTIG QCTCACCACA AGC3CIG0CT GOCfiQGneA AGQGATICrC GCACCrCAGC
12061 GCTOQGfOA CAGGAQGGIG GCaOCATOOC C^XTAATTT TTTIGQEAGA 

12121 csGQGfiTiCA. cxaaanaoc cagqciosic tcaaactgct gacctcaagt gatcc^coca 

12181 ULTIUUUL'IC GCAAAGOQCT GQGATIGCfiG Q33TGAGCAC 03IGOOCAQC CMTTCIdT
12241 'lULTlULTlU (X'lOJL'lUX' '1LXJLT1QCIT CJU1T1LT1UL ' TIULTILUIT ' lLTlTlLTlL ' 
12301 TIGAGACAftG GTCICfiCICC OflCACTAAG GCIGGAGAGC AGIGGCACAG TCACAGCTCA 
12361 CTOCAGGCIC AXTICCIQG GCT0QQGIGA. TICIGAGTAG CIGGCKTOCT GAGTAGCTGG 
12421 GACTACAGGC A1GIQCEACC A.TU LUULT AZTTnTIGT ATITITAAIA, GAGACMGGT 
12481 TTO30CKD3r TQCOCAAGCr QGACTTGAAC TOCT33QCIC AAGOGKPOGC ACIQ0Q00QG 
12541 OCTOCIGAAG TOCTAGGATT ACAG3CA1GA. GCXIACCATAC UlUUlL'ISKiT TITTTCIGTr 
12601 GTIQCIGTIT TIATAAIAGC CATICIAA3G GA3GIGAAGG GATATTTIGT TS lUlUmiT 
12661 TlTlTl'lCAT TrAITAICIT 'lTiiAliTlUAA TAGAAAGAAA QG33G'lUi!ATA. ATCAKITIGA. 
12721 CKEAGKEAAT TCT¥3EAGAT AAIATCAATC VLLATJ.T12AAG TOCATTCIGA AAACIGCTIG 
12781 TOGTTTIGAT AKXZAIGICT TEAAAGCAOC CCACTACATC ACAGICIGTG GOCAAAGTTG 
12841 AGGBCCAGCA TITAGACXnc: TCAATCCAQ3 GAAGALTITI 1 CTITSIGEAG CTCMGCTGG 
12901 QCTAGGIGTG ULTJLU1UGAG AKEGTfiGTIC A1T1UJAGCT CAGGQGTACT TQGQOCAGOC 
12961 QCIDQCTCX33 QCdTCICIG GTCAACAGIC TITIUIITLT AQ QGCTAAGA. C MM ' -nilUT 

13021 SaiTLOCAAAG TOGAOaCIC cicaagagat aaaacaaaag atccaactoc tjcaocagaa 
13081 GTCAGAGTIT GTOGAGAAGA. QCACAAAGTA ITlCltaGB T AGATOGGCTT GQGAGAAGAT 
13141 TGGAGGIGCA. 'lUL'lCALTlC CIOGCEAAGA. TOCACATAGC OCAGAGOOOC TCR LTIUULT 
13201 (JL'ILTIUJULJ TGGTCTIQCT GACCIGQZET CAAGCICTOC TO O gClHIi: CCIQGCIGAG 
13261 GGACCTAACT GCAGLTIL'IC TCIGCTOOCT TTOCCACATT TTA GAAACCC 'lULUl ' HJA GA 
13321 AA3GGAAAIG TICAAIIGG T3 AGTOCAQOQG TAA3GGTGIG TQCT3QCCIG UULjl ' lUl ' lUL ' 
13381 AGIGTIOCCT TGTGCIGTTG ACTIGAQQQG QOCTATTEAG AAGACAAAAA AAAAAACCAA. 
13441 ACAOCT3GAG CAAAGGTAGG AGAAAGGICA. TOGCAGGCOC OOCAQGCTCr GTOOGIGACT 
13501 CATIGACIGA GTTGACTCAT TAGACCACAG TCCQCAACAT GGOCIGGGTT GCIGGGAGGA 
13561 AGGGGATEAT ACGCAACATA. GCA3GCAGGG GOCTAAQCAG G33GTIGCIT UlUiTlULTl' 
13621 GTIGICAQGA CAGTGTAATT TAQOCDCTCT TAAIGCTAAT QCTCAGGAAT Tl ' lTlUUL ' iA 
13681 TUIGATITIT CI03GTAGTT OCAGAGCIGA. TTGGOGCTCA QGCACAJGCT GGTAAGIGGC 
13741 CAGATCAAGG CAAGIGGOOC TO3CCTQCTG GATCGCIGTG C1CICOXTA. OCACGTICCA 
13801 GAAGAACTAC UL'IUIUJL'IU TTIGCTGCAG GTGGGGAGAA OCCIGTAQGG ALLUl'lUUULA 
13861 TGGALLXJL'IA GCTAQGfEATT CAAATTTTCT TTGCAGTEAA TSTGA^'IL'IU GAJGCAGAAA 
13921 (JUULT1!AG0C CAAGCTCATC 'lTCIClXaAflXS AICIGAAGAG IGTIAGACTT QGAAACAAGT 
13981 QQGAGAQGCT GGCIGA3GGC OGGCAAAGAT TIGACAGCTG TK IUAI ' IUIT (JIUGGC' I CIC 
14041 GGAG1TIULT CIL'IQJLOJC CGTTACIGGG AGGIQGAGGT TGGAGACAAG ACAGCAIGGA, 
14101 ICCLGGGAGC CTGCAAGACA TCCATAAQCA QGAAAQQGAA CATGACICIG TOQCCAGAGA. 
14161 AJGGCTACIG GG'IGGIGATA. AJGAIGAAQG AAAATCAGTA. OCAQGCmUJ AGOJI'IGLIX! 
14221 OGAOOCOXT GCTAATAAAG GAGGCTOOCA AGGG 1 G1GGG Q jlLTimm GACTACAGAG 
14281 TIGGAAGCAT L'1U.T1T1!AC AAIGIGACAG GCRGATOOCA CAJCIATACA TTO3CCAGCT 
14341 QL'IL'ITIL'IL' TGUULUULTr CAAOCTATCT TCAiXXXTO G GACAGGIGAT QGAQQGAAGA. 
14401 ACACAGCTOC TC3XaACTATC TGTGCAGIQG OIGGICAGOG QQCTCAqlGA| AIGOCCAaCA

14461 ciGcarcrcr c tiul'iulti' ciqultiui' atcttgckit qmzacicaat AGicACGGaA.

14521 T300GACEAG G1U,TAGC1G L'lMUJUAAA TGCMAAAATA ACAAAftTACT TA LVLU1UU QC
14581 AOBGAGODCr AULUGftTlM 1 AGCAGAQGTA AGTEAQGAAC GAACATGTEA GrICAATOCQG
14641 GIGAAGACAT GiaL'lUflgjnjft. CACAGCATOG ATl'lUAGAQG AGGAAGTAGG gagill»tiuu 

14701 ataatcqgoc gjiuLHmjr qgcactcica gjigckctg aacagaagat ttg g ojl'iia 

14761 TlTlllULlA GAAOQCCAOG QCAAQGA!rAT fllUlUJULTl 1 GTIUIUIL'IU LTlL ' lUiUlT 
14821 GAGGATATOG GAAGQCIAGA GAAAQ3CAAG CAGACTGGAT TQQGATAGAA UEflClTlUlUr 
14881 AOCTGGATrA ATGAACTATC ATl ' l ' lTl ' m 1 Tl'mTlTlU AGACCAAATC T1UL ' 1L ' 1U113 
14941 GOOCAULjL'lU GAtjlULflGTG GCACGAlLL'IU AGL'HJAL'IUU AAtX'ICI^AOC TCCCAmi'H.' 
15001 AAQ0GAT11T (JL'IUUL'IUAG (JL'IUL'IUAQC AJL'IULUjAT TACAGmUOG TGGCACCACA
15061 QCAQGCIGGT T1 ' 1LT1U1!A T TITlflJrAGA GftLUUUmiT TCAGCAJGTT AGGCAQGCTG 
15121 GICTOSAACT OOiaOCTCA QSTOATCCAC (JUJUL'ICAGC CTOOCAAAGT GCU.ll-iAT.IA 
15181 CAQQCATGAG OCAC1U1ULU CUXITATGA TlLTl'l'l'l'lT T1T1TJ.'JLTJ.T TGAGACAAAG 
15241 ' ITl ' lUL ' lLTr GTCAOQCAQG CTGGAGIQCA UIGS1GCAAT LT1UUCTOGC AAeCTOOGOC 
15301 TOGCAajriU AAGAGATTCT OC'lUX'ltaG OCICOGAAGT AGCT3GGATT ACAGGOGGGC 
15361 GQCAOCATOC (JLUULTAATT T1T1ULATTT TEAOIAGACA TCAUxLTllA 'lUAim'lUGC 
15421 CAGUUUUmi: TCAAACTOCT GAGCTCAGGT GATOCAOOCA GCICAGOC'IC GCAAAGIGCA 
15481 QQGATTACAG QCATOAGOCA OCATGG0GQ3 (JLmUA!l'lLT TAAGAGAATT GAL'lUJULCT 
15541 CAJGAATAAA AAAATIAGAA AATCIU UICA TTIGCATITG TCACICAATC ACIGIGGAAT 
15601 GGCATnCGC GACIQCATIT NCAQGAAGIC AGATOQGACT ACIGTCATOG AAAAACATTT 
15661 GQQCATGTIA TTICCAAGIG TOKSATTATT C1U1LT1UGT TIGTATOQGA AAATCIGGQG
15721 GTTGIGGAAT ATTAGGITCT ACTICACACA CATO0O3TOC A1T1U1UJ1T CATITAAAGA 
15781 GATOEAAAGG GGCDGGGC A T QGIGACICAC ATCIGEAATC TCAGCATTTT GGGAGGGAAA 
15841 Q30333TGGA TGQGCIGAGC CCAGQGATIG AGAGCAGCTG QGCAAIGIGG GGAAAAOCOG 
15901 TCICEACAAA AAATACAAAA ATTAQCCATA GGGAMGQGQG T3QGAG3ATG GCTIGAGGGC
15961 AGGAGAIOGA GSCIGCAGCA GTEGAACIGAG ^CIGCACTAC GQCAATOCAG OCIGGGCAflC 
16021 AGAGIGAGIC (JL'IUIL'IUZA AAAAGIGGAT GTEAQGAGTA CAAAAATGAA ATOAAGATEA 
16081 GATOCAAACT OOTATOGCAA CTOCICTGIC Tl tJ A L T ACIA GAGTGTAGAT TAGACICAGA 
16141 TACICCATOG CTATGAIGAG AQCAGGTAAA CTIGCIGGGC TTICCTOCAC GAGTITEATr 
16201 CEATAAGAGT AATOCACATC GCAGGACAGT TCACATGAGC TAOQQCITAG CIGITCECIG 
16261 CGGTGGGTCA T3TCTEATTC O0GATICICC CTIGTTATAA QLT1T1CATC AATATCTTIG 
16321 TCIKEATTIT OCACCAGCIC AGCATATACA TATTTnTIC TOCT3IGITA TTOCEAAAAT 
16381 GOTTOCT3AA TOIGAAATAT CIGATAATOC TTCCEfiGGQG TIGOCATACC ATOL'ITIUCA 
16441 AAGAHnTIA AAATATTICA TGOOCAAAGC AA3GACIGQC ATTEAAAATT TITi'lUL'lUA 
16501 TTTAATAGGG AT3TAATCAG QCCTTACTTC TCflTITATIT CATEAOCTGT TAATCAGQCT 
16561 GIGAATTTTT GCAT3IGAAT TTCIGCTTIT T3LT1CATTC TKTGGAAATT GEACAGl'lUJ 
16621 TTIGAATACT ' lUL ' l!A!iTlUG AATCEACATA TIGAATTTOG TGTTTIGCIG TACTTCCICA 
16681 TiaCATQGIT TTAGGCIGGG TOOGGTGCTC AOQGCTGAAA TOQCAACATT TIQQGAGOGG 
16741 GAQGIGGGCA GGATOQGTTG GCAATOGAGG GITIGGAGAC GGAQOCTSGG CAGACATOGC 
16801 GftAflU JiUAJ CCICEWXTA GAAMSftlAAA CAAATEM30G CaGOCAATOG TOGIGAGCAC
16861 CICTM3IOCT GICTM3BTIG A 